Preface


When we set out to write this book in the spring of 2011, the only introduction to 
reverse mathematics was Simpson’s Subsystems of Second Order Arithmetic [288]. 
Our motivation was to write a complementary text, rather than a replacement. We 
planned on a more introductory treatment, necessarily less encyclopedic, that would 
offer a computability theoretic approach to the subject along with recent examples 
from the literature. But, almost as soon as we began writing, reverse mathematics 
started changing in fairly dramatic ways. 

Ideally, books in active research subjects should be written during lulls. A subject 
will grow rapidly, slow down, expand again, and then after a while, perhaps, reach a 
period of stability. New ideas and results continue to emerge, but not ones that upend 
the subject by suddenly changing its scope or direction. This is when it makes sense 
to write a book. If the authors are very lucky, they may even finish it before the next 
major expansion begins. 

We were not so lucky. The new expansion was major indeed and did not stop for 
many years. There was a sudden infusion of ideas from computable analysis centered 
around Weihrauch reducibility. There was the introduction of powerful new tools, 
such as preservation techniques and the probabilistic method. And there was a steady 
buildup of important new results, including the resolution of many longstanding 
open problems. Collectively, these developments reshaped and redefined reverse 
mathematics. In this way, our original conception began to seem less relevant. 

And so we gradually aimed the project in a different direction, and reworked it 
from the ground up. In particular, we are happy to present many new developments 
in the subject for the first time in book form, and we hope that the overall outcome 
better reflects the state of the subject today. Ultimately, our main goal for this book 
is to convey just that, and thereby to provide a springboard for reading papers in 
reverse math and, for those interested, for doing some, too. 

As mentioned, our treatment is based in computability theory. This has the dis- 
advantage of making the contents less accessible as compared to the more syntac- 
tic treatments of Simpson and others, which only rely on a basic background in 
logic. But computability and reverse mathematics are naturally complementary—as 
Shore [282] put it, there is “a rich and fruitful interplay” between the two—and 
this interaction has only become stronger over time. Indeed, concepts and results
from all parts of computability are increasingly finding applications to problems in 
reverse mathematics. This makes a contemporary account of the subject that omits 
or obscures the computability much more difficult, and much less useful. The com- 
bined perspective we have adopted treats reverse mathematics as overlapping with a 
large chunk of computable mathematics, Weihrauch style analysis, and other parts 
of computability that have become truly integral to the work of most researchers in 
the area. 

We assume a basic background in logic, as covered in standard first year grad- 
uate courses following texts like Enderton’s [93] or Mileti’s [210]. For the reasons 
discussed above, a previous introduction to computability theory (e.g., Soare [295] 
or Downey and Hirschfeldt [83]) will undoubtedly be helpful. Chapter 2 provides an 
overview of the subject that can serve as a standalone introduction or refresher. Our 
notation will for the most part be standard, and is summarized in Section 1.5. 

This book is still by no means meant as a replacement for Simpson’s text, which 
includes many examples and results that we have chosen to omit. More generally, 
we have tried as much as possible to avoid duplicating material that is already well 
covered elsewhere. There are now a number of other texts focusing, in whole or in 
part, on reverse mathematics. These include Slicing the Truth by Hirschfeldt [147]; 
Calculabilité by Monin and Patey [219]; and Mathematical Logic and Computation 
by Avigad [8]. We recommend all of these as excellent companion texts. Stillwell’s 
Reverse mathematics: proofs from the inside out [304] also provides a nice introduc- 
tion to the subject aimed at a more general audience. We will mention a number of 
other references throughout the text. 

The content of this book is organized into four parts. Part I includes the aforementioned
background chapter on computability theory; a chapter on instance-solution
problems, which are a main object of study in computable mathematics; and a chap- 
ter developing various reducibilities between such problems, including computable 
reducibility and Weihrauch reducibility. Part II introduces second order arithmetic 
and the major subsystems used in reverse mathematics. This is followed by a chapter 
on induction and other first order considerations, which are covered in other texts 
in more generality but deserve a concise treatment that makes them more accessi- 
ble to the reverse mathematics community. We then move to a chapter on forcing 
in computability theory and arithmetic, with applications to conservation results. 
These first two parts lay out the bulk of the general theory of reverse mathematics, 
with the remainder of the book dedicated to specific case studies. 

Part III focuses on the reverse mathematics of combinatorics, with one chapter 
dedicated exclusively to Ramsey’s theorem, and another to many other combinatorial 
principles that have been studied in the literature. Part IV contains chapters on the
reverse math of analysis, algebra, set theory, and descriptive set theory, as well as more
advanced topics, including a short introduction to higher order reverse mathematics.
Exercises at the end of each chapter cover supplementary topics and fill in details 
omitted for brevity in the main text. 

In terms of how to read this book, Parts I and II stand largely apart from Parts III
and IV. The reader looking to learn the subject for the first time should therefore start 
in the first two parts, choosing specific chapters according to their background and
interest. A reader already acquainted with computable mathematics and the reverse 
mathematics framework, on the other hand, can advance directly to the latter parts 
for an overview of research results, or to learn advanced techniques used in the 
reverse math of different areas. Dependencies between chapters are shown in the 
diagram below, where Chapter X → Chapter Y means that the material in Chapter
Y is necessary (in substantial part) for Chapter X. 


[Figure: dependency diagram for Chapters 1 through 12.]


The way that ideas evolve and connect over time is messy. As a result, our 
presentation is not always chronological. This goes for results (e.g., a newer theorem
sometimes being presented ahead of an older one) as well as broader themes.
For example, we give an account of Weihrauch reducibility before second order 
arithmetic, even though the latter was developed earlier, and certainly was considered 
a part of reverse mathematics long before the former. In a similar spirit, outside of a 
section in Chapter 1 and some brief remarks here and there whose content is already
well known, our account will be largely ahistoric. That said, there are a few places
where we could not resist the dramatic buildup offered by giving an account of how a
theorem was arrived at, including prior partial results, failed attempts, etc. A good 
book should have some drama. 

We do not, however, wish to cause any drama with this book. We wholly admit that 
the presentation of the subject here, and the choice of topics included and omitted, 
reflects only a particular point of view, which is our own. There is certainly much 
more we wish we could have included. The fact is that reverse mathematics, as a 
scientific field, is now quite broad and multifaceted, and it is too tall an order to try 
to squeeze all of it into one book. We are after the core of the subject here, from our 
perspective. Simpson [288] has said of reverse mathematics, and of the foundations 
of mathematics more generally, that it is the “study of the most basic concepts and
logical structure of mathematics, with an eye to the unity of human knowledge”. 
Shore [282] has called it, somewhat less grandly, the “playground of logic”. It is 
both of these things: a foundationally important endeavor, and also one that is just 
plain fun. We hope this book conveys a bit of each. So with that, let’s go play. 
Chapter 1

Introduction


1.1 What is reverse mathematics? 


For most of its existence as a subject, reverse mathematics had a clear and unam- 
biguous definition as a program in the foundations of mathematics concerned with 
the question of which axioms are necessary (as opposed to sufficient) for proving 
various mathematical theorems. Which theorems this includes is a question in its 
own right. Simpson [288, p. 1] refers to theorems of “ordinary mathematics”, by 
which he means mathematics “prior to or independent of the introduction of abstract 
set theoretic concepts”. This means mathematics that does not principally involve
sets of arbitrary set theoretic complexity—modern algebra, geometry, logic, number 
theory, real and complex analysis, and some limited topology (but certainly not all 
of any of these fields). 

The remarkable observation, due to Friedman, is that “When the theorem is 
proved from the right axioms, the axioms can be proved from the theorem”. This 
phenomenon appears with many theorems: once the theorem is proved from an 
appropriate axiomatic system, it is also possible to show that the axioms of that 
system are derivable from the theorem itself, over a weak base theory. In this sense, 
it is said that the theorem “reverses” to the axioms, giving us a possible etymology 
for the name “reverse mathematics”. The other possibility, of course, is that it is a 
play on “reverse engineering”, which is quite apt as well. 

The proof that the theorem implies particular axioms over a base theory is known 
as a reversal to those axioms. A reversal provides a measurement of a theorem’s 
axiomatic strength: a theorem requiring a stronger system to prove is stronger than a 
theorem that can be proved in a weaker system. In particular, if a theorem T implies 
a set of axioms S over a base system, then any other proof of T in an extension of the 
base system must use axioms that imply S. This gives a methodological conclusion 
about the minimum axioms needed to prove the theorem. 
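
The reasoning in the preceding paragraph can be written out in one line. As a sketch in notation that is ours rather than the text's, write $B$ for the base system, $A$ for the axioms used in some other proof of $T$, and $\vdash$ for provability:

```latex
\[
  B + T \vdash S
  \quad\text{and}\quad
  B + A \vdash T
  \quad\Longrightarrow\quad
  B + A \vdash S .
\]
```

That is, any axioms that suffice to prove $T$ over the base system also prove $S$ over it, by chaining the two derivations.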

Reverse mathematics traditionally, but not always, uses subsystems of second 
order arithmetic for these axiom systems. Second order arithmetic is useful for this 
purpose because of its intermediate strength: it is strong enough to formalize much of 
“ordinary mathematics”, but weak enough that it does not overshadow the theorems
being studied in the way that set theory does. 

Over time, reverse mathematics has become more difficult to distinguish from 
computable mathematics, sometimes called applied computability theory. This 
should be distinguished from constructive mathematics, which aims to study math- 
ematical theorems using wholly constructive means, which do not include the law 
of the excluded middle or other nonconstructive axioms. Computable mathematics, 
by contrast, seeks to measure the constructive content of mathematical theorems (or 
lack thereof) using the rich assortment of tools developed in classical computability 
theory. A prominent point of focus is on theorems of the form 


(∀x)[φ(x) → (∃y)ψ(x, y)], (1.1)


where x and y are understood to range over the elements of some ambient set, and φ
and ψ are properties of x and y of some kind. For obvious reasons, we will refer to
statements of this form as ∀∃ theorems.

When x and y range over numbers or sets of numbers, the theorem lends itself 
naturally to computability theoretic (or effective) analysis. Two commonly asked 
questions include the following. 


* Given a computable x such that φ(x) holds, is there always a computable y such
that ψ(x, y) holds?

* Does there exist a computable x such that φ(x) holds, and such that every y for
which ψ(x, y) holds computes the halting set?
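
For readers who prefer symbols, the two questions can be sketched as follows. The notation is an assumption on our part, not introduced in the text at this point: $\leq_{\mathrm{T}}$ denotes Turing reducibility ("is computable from") and $\emptyset'$ denotes the halting set.

```latex
\begin{align*}
 &(\forall x)\,\bigl[\, x \text{ computable} \wedge \varphi(x)
     \;\to\; (\exists y)\,[\, y \text{ computable} \wedge \psi(x,y) \,]\,\bigr], \\
 &(\exists x)\,\bigl[\, x \text{ computable} \wedge \varphi(x)
     \wedge (\forall y)\,[\, \psi(x,y) \to \emptyset' \leq_{\mathrm{T}} y \,]\,\bigr].
\end{align*}
```

The first asks whether computable instances always have computable solutions; the second asks whether some computable instance forces every solution to be at least as complex as the halting set.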


As it turns out, more often than not, the answers to these questions (and others like 
them) are directly reflected in the axiomatic strength of the theorem, as measured 
using the frameworks of reverse mathematics. This is not an accident, but rather 
a consequence of the setup of reverse mathematics, particularly of how the formal 
systems involved are defined. For example, the base system RCA₀ typically used in
reverse mathematics is often viewed as a formalization of computable mathematics. 
This makes it possible to directly translate many results from computability theory 
into reverse mathematics, and vice versa. The tie between computability theory and 
reverse mathematics is a key source of interest for many reverse mathematicians. 

To accommodate both the axiomatic and computability theoretic viewpoints, we 
will understand reverse mathematics in this book as denoting a program concerned, 
quite generally, with the complexity of solving problems. 

Intuitively, to solve a problem means to have a method to produce a solution to 
each given instance of that problem. For example: knowing how to factor composite 
integers means being able to produce a nontrivial factor of a given composite integer; 
knowing how to construct maximal ideals of nonzero unital rings means being able 
to produce such an ideal from a given such ring; and so forth. Depending on the 
problem and the instance, there may be many possible solutions, or a unique one. 
The task of “solving” a problem means being able to produce at least one solution for
each instance, not necessarily to produce all of the solutions. 
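
As a small illustration of the instance and solution terminology, the following Python sketch treats factoring as a problem whose instances are composite integers and whose solutions are nontrivial factors. The names `is_instance`, `is_solution`, and `solve` are our own labels for this sketch, and trial division is used only because it is short, not because it is efficient.

```python
def is_instance(n: int) -> bool:
    """An instance of the problem is a composite integer n > 1."""
    return n > 1 and any(n % d == 0 for d in range(2, n))


def is_solution(n: int, d: int) -> bool:
    """A solution to the instance n is any nontrivial factor d of n."""
    return 1 < d < n and n % d == 0


def solve(n: int) -> int:
    """Produce one solution (the least nontrivial factor) by trial division."""
    if not is_instance(n):
        raise ValueError("not an instance: n must be composite")
    d = 2
    while n % d != 0:  # terminates because a composite n has a factor below n
        d += 1
    return d
```

The instance 91 has two solutions, 7 and 13. Here `solve(91)` returns only one of them, which is all that solving the problem requires.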

Conceptually, perhaps the first evolution of the idea of solving a problem is the 
idea of solving it well. This can mean different things for different problems and in 
different contexts. We all know how to factor integers, for example, but it would be
quite a revolution if someone figured out how to do it quickly, say in polynomial time 
relative to the integer to be factored. As of this writing, the best algorithm for integer 
factorization is the general number field sieve, which is far slower. But this raises 
an equally interesting question: could this actually be the best possible algorithm? It 
is often the case that a mathematical problem is first solved somewhat crudely, and 
then over time the solution is refined in various ways, perhaps many times over. Is 
there a sense in which we could say that we have reached an optimal refinement, that 
we have found the “most efficient” solution possible? 

This is a much more encompassing approach than may appear at first glance. 
Most noticeably, perhaps, it covers ∀∃ theorems, and with them a great swath of
“ordinary mathematics”. In particular, to each theorem having the form (1.1) we may
associate the problem “given x such that φ(x) holds, find a y such that ψ(x, y)
holds”. Moving further, we can also consider problems about problems, and in that
way accommodate an even broader discussion. It is entirely common, after all, to 
reduce one problem to another, or to show that two problems are equivalent
in some sense. Even finding a proof of a theorem in a given formal system is an 
example of such a reduction, too: “reduce this theorem to these axioms, via a proof”. 
And if we think of the task of finding such reductions as problems in their own right, 
we can again ask about efficiency and optimality (e.g., is there a shorter proof? is 
there a more constructive one? etc.). 

We do not mean to suggest a single abstract framework for reverse mathematics, 
computable mathematics, and everything surrounding them. Far from it. But the 
view that we are interested in problems, and how difficult they are to solve, conveys 
a theme that will serve us well as motivation and intuition in the pages ahead. 


1.2 Historical remarks 


A full account of the history of reverse mathematics would require its own book. 
Here, we give a brief sketch of the origins and development of reverse mathematics 
from its prehistory, through its origins in the 1970s, and to the present. We mention 
a few contributions that we view as important waypoints in the development of the 
field, with no intention to minimize the many contributions not mentioned. The 
results presented in the remainder of this book help to fill out this historical sketch. 

Before reverse mathematics became a separate field, Weyl, Feferman, Kreisel, and 
others investigated the possibility of formalizing mathematics in systems of second 
order or higher order arithmetic, in the spirit of arithmetization of analysis. In the 
mid-20th century, early results in computable and constructive mathematics explored 
the ability to carry out mathematics effectively, as in the work of Specker [299] to 
construct a bounded, increasing, computable sequence of rational numbers whose 
limit is not a computable real number. In 1967, Bishop’s seminal Foundations 
of Constructive Analysis demonstrated the breadth of mathematics that could be
studied constructively. For a much deeper account of this “prehistory” of reverse 
mathematics, see Dean and Walsh [64]. 

Reverse mathematics itself was at least partially inspired by an ancient problem: 
given a mathematical theorem, can we specify precisely which axioms are needed 
to prove it? Some two millennia ago, Greek logicians asked this question about 
theorems in Euclid’s geometry. A hundred years ago, similar questions were being 
raised about the axiom of choice, and about which fragments of set theory could be 
retained without it. Historically, this study of logical strength (of axioms, of theorems, 
of parts of logic or mathematics) has had profound foundational consequences, 
enhancing our understanding of our most basic assumptions and the most complex 
theorems that depend on them. 

Combining these antecedents, Friedman proposed a program to measure the 
axiomatic strength of mathematical theorems using various subsystems of second order
arithmetic as benchmarks. This program was originally introduced in a series of talks 
at the 1975 International Congress of Mathematicians and the 1976 Annual Meeting 
of the Association for Symbolic Logic. Friedman [108, 109] identified a particular 
collection of subsystems of second order arithmetic—the so-called “big five”— 
and a number of mathematical theorems equivalent to each over the weak base 
theory RCA₀. In his 1976 PhD thesis, Steel [302] (working under the supervision of
Simpson) presented another early example. Descriptions of the emergence of reverse 
mathematics are given by Friedman and Simpson [116] and Friedman [113]. 

In the 1980s, significant progress was made in formalizing theorems of “ordinary” 
mathematics into second order arithmetic and showing these theorems equivalent to 
one of the “big five” systems. Simpson, along with several PhD students, was key 
in developing reverse mathematics into an active research field. A principal focus in 
this period was expanding the field, which now included algebra, real and complex 
analysis, differential equations, logic, and descriptive set theory. A limited number of 
combinatorial results were also studied, including Hindman’s theorem and Ramsey’s 
theorem. This work, and much more, is well documented in Simpson [288]. 

A key motivation emphasized in the literature from this time is the philosophical 
and foundational information gained in reverse mathematics. In particular, when a 
seemingly powerful theorem is provable in a weak system, it shows that a person 
willing to accept the weak system (in some philosophical sense) should be led 
to accept the theorem in the same sense. The weakest three systems of the “big 
five”, in particular, have small proof theoretic ordinals and are generally viewed as
predicative. 

By the late 1990s, reverse mathematics had drawn the attention of computability 
theorists. They realized that second order arithmetic provided a “playground”, in 
the words of Shore [282], where established computability techniques could be 
applied and new techniques could be developed. In particular, it was commonly 
noted at the time that the reverse mathematics results being obtained rarely required 
the complicated priority arguments that had become the norm in degree theory. In 
many cases, reverse mathematics proofs began to be written in a more informal, 
computability oriented manner. At the same time, researchers also began to look 
at implications that hold over ω-models, rather than only at implications provable
in RCA₀. Implications over ω-models do not have the same foundational implications,
but they have stronger ties to computability. 

A significant breakthrough related to combinatorics occurred near the end of 
the 1990s. Work of Seetapun [275] and Cholak, Jockusch, and Slaman [33] showed
that much more could be done with Ramsey’s theorem than had been realized. 
Seetapun’s work, in particular, showed the utility of effective forcing as a method to 
build models of second order arithmetic. This led to an explosion of results in the 
reverse mathematics of combinatorial principles. These techniques and results form 
a significant portion of this text. They also show that, although priority arguments 
never made a significant appearance in reverse mathematics (we will see a few in 
Chapter 9), the method of forcing turns out to be extremely applicable. 

Another development was the discovery of a close connection between reverse 
mathematics and computable analysis. The possibility of this was first suggested by 
Gherardi and Marcone [122], and later independently by Dorais, Dzhafarov, Hirst, 
Mileti, and Shafer [72]. The latter group rediscovered a reducibility notion originally 
described 20 years earlier by Weihrauch [324]. The programs of Weihrauch-style 
computable analysis and reverse mathematics had developed separately, with little 
to no overlap, until the deep relationships between them were suddenly realized. 
This unanticipated relationship opened many research problems for both reverse 
mathematics and Weihrauch-style computability. It also led to an adoption of several 
reducibility notions as tools within reverse mathematics. 

We view reverse mathematics broadly, as a program including all these influences 
and viewpoints. One common thread is the goal, when possible, to find the minimum 
resources needed to prove a theorem or solve a problem. This distinguishes reverse 
mathematics from many other programs of computable or constructive analysis, 
where the goal is to find a way to make theorems provable with a given framework, 
often by modifying the theorems or adding hypotheses. 

There are several branches of reverse mathematics and logic that pull in other di- 
rections, and which we will not discuss in detail. The program of constructive reverse 
mathematics studies the axioms needed to prove theorems within a constructive base 
theory (which may be informal); Diener and Ishihara [67] provide an introduction. 
The program of higher order reverse mathematics [185] uses third order or higher 
order arithmetic. This allows for theorems to be represented differently, and can 
help to avoid some of the coding needed merely to state theorems in the context 
of second order arithmetic. Recent work of Normann and Sanders illustrates the 
range of possible applications of higher order techniques. We survey some of these 
results in Section 12.4. Higher order reverse mathematics is also linked to the proof 
mining program of Kohlenbach [183]. Strict reverse mathematics, also developed 
by Friedman [112], uses a multisorted free logic instead of second order arithmetic, 
and attempts to minimize or eliminate coding and logical formalisms in order to 
remain “strictly mathematical”. In proof theory, there has been substantial work on
subsystems of second order arithmetic, specifically including characterizations of 
the proof theoretic ordinals of many subsystems [250, 255]. 


1.3 Considerations about coding


We have explained that reverse mathematics results typically use methods from com- 
putability theory and second order arithmetic to study theorems and their associated 
problem forms. However, neither the fundamental notions of computability nor sec- 
ond order arithmetic refer directly to the mathematical objects we wish to study, 
such as groups, fields, vector spaces, metric spaces, continuous functions, ordinals, 
Borel sets, etc. Therefore, it is necessary to code these objects in ways that are more 
amenable to our analysis. We need to represent the objects of interest with more 
basic objects such as natural numbers or sets of naturals. 

The first consequence of this is that classical reverse mathematics can only work 
with miniaturizations of many theorems of “ordinary mathematics”, meaning ver- 
sions formulated in terms of countable or separable objects. Thus we can only 
consider theorems about countable groups, complete separable metric spaces, and 
so forth. The second consequence is that, formally, we only ever work with repre- 
sentations or codings of mathematical objects, rather than those objects directly. 

This leads to a key challenge in reverse mathematics. Some coding systems, such 
as the method for representing continuous functions, are highly nontrivial. But the use 
of nontrivial coding systems leads to a question of how much our results reflect the 
strength of the original theorems (or at least, the strength of their miniaturizations), 
and how much the results are influenced by the specific choices of coding system. 
(Eastaugh and Sanders [92] have termed this coding overhead.) Just as the observer 
effect is a fundamental limitation for physics, the need for coding is a fundamental 
limitation of reverse mathematics. While higher order reverse mathematics and strict
reverse mathematics attempt to reduce the coding, some amount of coding in the 
broadest sense is inevitable. 

Because coding cannot be avoided when formalizing theorems, the use of non- 
trivial coding systems leads naturally to questions about the optimality of the codings 
being employed. It is difficult to find mathematical criteria to determine which cod- 
ing system is the best. This problem—to find a rigorous way to compare coding 
systems—is not itself a topic of reverse mathematics, but it is sometimes considered 
an open problem in the broad spirit of foundations of mathematics. Progress on this 
problem through mathematical methods seems to require a breakthrough in new 
ways to formally compare coding systems. 

In practice, the choice of coding systems comes down to a combination of fac- 
tors we describe as utility, directness, fullness, and professional judgment. These 
are not formal, mathematical criteria, but they nevertheless illustrate the ways that 
researchers in the literature evaluate and choose coding systems. 

The utility of a coding system can be seen in the results it allows us to formalize 
and analyze. This is an aesthetic judgment, of course. But, in the end, the purpose of 
a coding system is to allow us to obtain results, and a system that allows few results 
to be obtained will be of correspondingly little interest. Utility can also present itself 
as uniformity. For example, the coding of real numbers with quickly converging 
Cauchy sequences leads to the operations being uniformly computable. The coding 
of continuous functions allows us to evaluate and compose functions in RCA₀.
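
To make the remark about uniform computability concrete, here is a minimal Python sketch of addition on reals coded by quickly converging Cauchy sequences. It assumes one common convention, namely that a code for a real x is a function q with |q(n) - x| ≤ 2⁻ⁿ for every n. The names `Real`, `const`, `sqrt2`, and `add` are ours and purely illustrative.

```python
from fractions import Fraction
from math import isqrt
from typing import Callable

# A code for a real x: a function q with |q(n) - x| <= 2**-n for every n.
# (Assumed convention for this sketch; other texts index slightly differently.)
Real = Callable[[int], Fraction]


def const(r: Fraction) -> Real:
    """Code of a rational number: the constant sequence."""
    return lambda n: r


def sqrt2(n: int) -> Fraction:
    """Code of sqrt(2): floor(sqrt(2) * 2**n) / 2**n, within 2**-n of sqrt(2)."""
    return Fraction(isqrt(2 * 4**n), 2**n)


def add(x: Real, y: Real) -> Real:
    """Uniform addition: a single procedure that works for all codes x and y.

    Each summand is queried at precision n + 1, so the two errors of at most
    2**-(n+1) sum to at most 2**-n.
    """
    return lambda n: x(n + 1) + y(n + 1)
```

The point of the design is that `add` never inspects which reals it is given. The same finite procedure turns any two codes into a code for the sum, which is what uniformity means here.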

By directness, we mean the close relationship between the coding used and what is
being coded. For example, there is significant directness in representing a countably 
infinite group by numbering the elements and representing the group operation as a 
function from ω × ω to ω. On the other hand, our coding for continuous functions
is arguably not as direct as simply coding the function’s value for each element in a 
dense subset. But that alternate system makes it much more challenging to compose
functions, and so lacks utility.
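
As an illustration of a direct coding, the following Python sketch codes the group of integers under addition by numbering its elements and giving the group operation as a function on those numbers. The particular numbering 0, 1, -1, 2, -2, ... is one arbitrary choice made for this sketch, and the names `encode`, `decode`, and `op` are ours.

```python
def encode(z: int) -> int:
    """Number the integers 0, 1, -1, 2, -2, ... as 0, 1, 2, 3, 4, ..."""
    return 2 * z - 1 if z > 0 else -2 * z


def decode(n: int) -> int:
    """Inverse of encode: recover the integer with code n."""
    return (n + 1) // 2 if n % 2 == 1 else -(n // 2)


def op(m: int, n: int) -> int:
    """The group operation of the integers under addition, acting on codes."""
    return encode(decode(m) + decode(n))


IDENTITY = encode(0)  # the identity element has code 0
```

For example, 3 codes the integer 2 and 4 codes -2, so `op(3, 4)` is the code of 0. Everything about the group that is visible to the formal analysis is carried by the function `op` on natural numbers.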

Directness also relates to the inclusion (or not) of additional information in the 
coding. For example, we can ask whether continuous real-valued functions on [0, 1] 
are coded in a way that includes a modulus of uniform continuity. It is common in 
reverse mathematics to attempt to minimize the amount of additional information 
carried directly in the coding. This allows us to analyze the difficulty of obtaining that 
information from the coding, and also allows us to include the additional information 
in the hypothesis of theorems when we choose. 

The other side of directness is whether the coding somehow removes information 
that would have been available with the original objects. In constructive mathematics, 
it is common to include a modulus of continuity with each continuous function. In 
that setting, it is assumed that a constructive proof that a function is continuous 
would demonstrate the modulus. 

Fullness is the property that every object of the desired type has a code of the 
desired type in the standard model. This is a vital property for a coding system, 
as it shows that our formalization does not exclude any actual objects of interest. 
For example, every real number is the limit of some quickly converging Cauchy 
sequence of rationals, and every open set of reals is the union of a sequence of open 
rational intervals. Thus, when interpreted in the standard model, a theorem referring 
to coded objects like these retains its original scope. Of course, nonstandard models 
may have additional codes for nonstandard objects, and submodels of the standard 
model may not include codes for all objects. 

One example of a system that lacks fullness comes from certain schools of 
constructive analysis, where a real number is coded as an algorithm or Turing 
machine that produces arbitrarily close approximations to the number. This coding 
limits the set of coded reals to the set of computable real numbers. Some theorems 
from ordinary mathematics are thus false in this framework, such as the theorem that 
every bounded increasing sequence of rationals converges.

The final test is professional judgment, which includes an aspect of opinion but 
cannot be disregarded. As with any field, the choice of which problems to study 
and which frameworks to use is not purely mathematical, but includes a significant 
aesthetic element. 

When multiple coding systems are possible, there is an opportunity to compare 
the coding systems in various ways. This can lead to interesting mathematics on 
its own. An example in reverse mathematics can be seen in Hirst [158]. Weihrauch 
reducibility, with its focus on representations, gives particularly powerful methods 
to compare coding systems. For example, we can consider the problem of computing 
the identity function f(x) = x with the input and output in different representations. 
In Section 12.4, we discuss work in higher order arithmetic that provides a different
approach to examining coding systems. 

We will see numerous coding systems throughout the remainder of this text. The 
preceding comments provide a framework for the reader to consider whenever a new 
coding system is presented. In particular, although the term is not often used in the 
literature, fullness is a key consideration that all or nearly all of our coding systems 
will possess. 


1.4 Philosophical implications 


In the wake of the foundational crisis of mathematics in the early part of the 20th 
century, the mathematical community saw the emergence of a controversy between 
several schools of thought about the philosophy of mathematics and mathematical 
practice. Among these were formalism, championed by Hilbert, and various forms 
of constructivism, including the version of intuitionism championed by Brouwer. 
Attempts to reconcile these opposing views ultimately led Hilbert in the 1920s to 
formulate what is now called Hilbert’s program for the foundations of mathematics. 
This included as a key tenet the existence of a formalization wherein mathematics 
could be proved to be consistent using wholly constructive “finitistic” means. Any 
hope of completely realizing Hilbert’s program was famously dashed a decade later 
by Gödel with his incompleteness theorems. However, Simpson [286] and others 
have argued that formalizations of theorems in weak subsystems of second order 
arithmetic provide a partial realization of Hilbert’s program. 

The program of predicativism, advocated by Poincaré and Weyl, was also influential. 
This program seeks to formalize mathematics in ways that avoid impredicative 
definitions. An example of an impredicative definition is the definition of the least 
upper bound of a nonempty bounded set A of reals as the smallest element of the set 
of all upper bounds of A. In contrast, one predicative construction of the least upper 
bound begins by forming a sequence of rationals $(r_i)$, each of which is an upper 
bound for A, so that there is an element of A within distance $2^{-i}$ of $r_i$, and then 
forming the least upper bound as the limit of the sequence $(r_i)$. This construction 
produces the least upper bound in a way that only quantifies over the rationals and 
the set A (which we are given from the start) rather than the set of all real numbers. 

Each of the “big five” subsystems of reverse mathematics has a philosophical 
interpretation, based on the proof theoretic ordinal of the system, and broader expe- 
rience with the system. 


• $\mathsf{RCA}_0$ has an interpretation as effective (computable) mathematics, including the 
law of the excluded middle. 

• $\mathsf{WKL}_0$ has an interpretation described by Simpson as finitistic reductionism. 
While $\mathsf{WKL}_0$ is not finitistic, it is conservative over the finitistic system PRA for 
$\Pi^0_2$ formulas. Thus, even though a person embracing finitism might not accept 
$\mathsf{WKL}_0$ as meaningful, they should still accept a proof of a $\Pi^0_2$ sentence in $\mathsf{WKL}_0$. 

• $\mathsf{ACA}_0$ has an interpretation as (a portion of) predicative mathematics. This system 
is conservative for arithmetical formulas over first order Peano arithmetic. 

• $\mathsf{ATR}_0$ has an interpretation via predicative reductionism. Although $\mathsf{ATR}_0$ is 
generally viewed as impredicative, it is conservative over a predicative system IR of 
Feferman (see [115]). 

• Theorems requiring $\Pi^1_1$-$\mathsf{CA}_0$ or stronger systems are more fully impredicative. 


Overall, these systems form a small portion of a large hierarchy of formal systems, 
ranging from extremely weak theories of bounded arithmetic through large cardinals 
in set theory. Simpson [289] calls this the Gödel hierarchy. A more recent account 
can be found in Sanders [267]. 

Beyond the foundational aspects of each of the “big five” subsystems lie consid- 
erations about the “big five” phenomenon itself. While its historical prominence is 
indisputable, the emergence of more and more counterexamples to this classification 
over the past twenty years—resulting in what is now called the reverse mathematics 
zoo (see Section 9.12)—raises questions about the nature and significance of this 
phenomenon. There is no a priori reason one should expect a small number of axiom 
systems to be as ubiquitous and (very nearly) all encompassing as the “big five”. Nor 
is there any reason the systems that appear should necessarily be linearly ordered in 
strength, as the “big five” subsystems are. 

Slaman (personal communication) has suggested the phenomenon is an artifact 
of the choice of theorems and available techniques; as the subject expands and 
techniques develop, the phenomenon will grow less pronounced. Eastaugh [91] 
provides a different pushback against the standard view of the “big five” phenomenon, 
arguing that it is better understood in terms of closure conditions on the power set of 
the natural numbers. In the other direction, Montalbán [220] proposed that the “big 
five” subsystems owe their importance to being robust, meaning invariant under 
“small perturbations”. This makes a plausible case for why, when a theorem is 
equivalent to one of the “big five” subsystems, so are its natural variations and 
elaborations. As for linearity of the subsystems, and the seemingly hierarchical nature 
of reverse mathematics, Normann and Sanders [234] and Sanders [272] have argued 
that this is a consequence of the choice of working in second order arithmetic; see 
Section 12.4 for a further discussion. 

Another interesting question concerns the growing collection of exceptions to the 
“big five” phenomenon, and the curious fact that they come predominantly from 
combinatorics. As noted in [31], computability theoretically natural notions tend 
to be combinatorially natural, and vice versa. This makes combinatorial theorems 
interesting from the point of view of computability, and hence also reverse mathe- 
matics. One possibility for why so many theorems from combinatorics (particularly, 
extremal combinatorics) fall outside the “big five” is that they require little to no 
coding to formalize in second order arithmetic. Indeed, codings often directly fea- 
ture in the reversals of theorems from other areas—even when, with reference to 
Section 1.3, the coding overhead is minimal. In effect: with simple mathematical 
objects, there is not much to code, but also not much to code with. 

A different take is that combinatorics itself may play a special role within reverse 
mathematics. There is the notion in mathematics of the “combinatorial core” of 
a theorem, which is usually understood to be an amalgam of the combinatorial 
properties needed to work with and manipulate the theorem. As pointed out by 
Hirschfeldt [147], a common perspective is that it is the “combinatorial core” that 
the analysis of a theorem in reverse mathematics actually reveals. For this reason, 
equivalent theorems (in the sense of reverse mathematics) are sometimes said to 
have the same “underlying combinatorics”. Of course, on this view, it would stand to 
reason that combinatorics offers the widest assortment of theorems with “distinct” 
combinatorial cores. 

Which of these questions are most compelling, and which proposed solution, if 
any, holds the most explanatory value, are of course primarily for philosophers to 
discuss. We are not philosophers, and will generally focus on the mathematical side of 
reverse mathematics, without drawing philosophical consequences in the remainder 
of the text. But we consider these types of reflections important, and encourage the 
reader to keep them in mind (at least in a general sense) while working through this 
book. There is room for much philosophical analysis of reverse mathematics—far 
beyond the smattering of topics presented above—and this analysis would benefit 
from open and thoughtful participation by mathematicians. Conversely, being aware 
of some of these philosophical considerations can only help better inform and ground 
the mathematician working in the subject. 


1.5 Conventions and notation 


We let $\omega$ denote the set of natural numbers, $\{0, 1, 2, \ldots\}$, as well as the usual ordinal 
number. Thus, we may write $n \in \omega$ and $n < \omega$ interchangeably. For a set $X \subseteq \omega$ 
and $n \in \omega$, we write $X \upharpoonright n$ for the set $\{x \in X : x < n\}$. (Thus, $X \upharpoonright 0 = \emptyset$.) For 
convenience, we implicitly assume a mild typing wherein the elements of $\omega$ are 
distinguished from sets and other set theoretic constructs. So, for example, $\emptyset$ should 
never be confused with $0 \in \omega$, etc. 

For subsets X and Y of $\omega$ we write $X < Y$ as shorthand for $\max X < \min Y$. For 
$x \in \omega$, we write $x < X$ or $X < x$ as shorthand for $x < \min X$ and $\max X < x$, 
respectively. We write $X \subseteq^* Y$ if $X \setminus Y$ is finite, and say X is almost contained in Y. 
We write $X =^* Y$ if $X \subseteq^* Y$ and $Y \subseteq^* X$; in this case X and Y are almost equal. For 
finite sets X, we write $|X|$ for the cardinality of X. 

The principal function of an infinite set $X = \{x_0 < x_1 < \cdots\} \subseteq \omega$ is the function 
$p_X \colon \omega \to X$ such that $p_X(n) = x_n$ for all n; thus $p_X$ enumerates X in increasing 
order. Given two functions $f, g \colon X \to \omega$, we say f dominates g if $f(n) > g(n)$ for 
all sufficiently large n. 


Definition 1.5.1 (Finite sequences). Fix $X \subseteq \omega$. 


1. A finite sequence or string or tuple in X is a function $\alpha \colon \omega \upharpoonright n \to X$ for some 
$n \in \omega$. The length of $\alpha$ is n, denoted by $\mathrm{lh}(\alpha)$, or more commonly, $|\alpha|$. We write 
$\alpha = \langle x_0, \ldots, x_{n-1} \rangle$ if $\alpha(i) = x_i$ for all $i < n$. 

2. The unique string of length 0 is denoted $\langle\rangle$. 

3. If $\alpha, \beta$ are finite sequences with $|\alpha| \leq |\beta|$ and $\alpha(i) = \beta(i)$ for all $i < |\alpha|$, then $\alpha$ 
is an initial segment of $\beta$, and $\beta$ is an extension of $\alpha$, written $\alpha \preceq \beta$. If $\alpha \neq \beta$, 
then $\alpha$ is a proper initial segment of $\beta$, written $\alpha \prec \beta$. 

4. If $\alpha, \beta$ are finite sequences then the concatenation of $\alpha$ followed by $\beta$, written 
$\alpha\beta$ or $\alpha^\frown\beta$, is the string $\gamma$ of length $|\alpha| + |\beta|$ with $\gamma(i) = \alpha(i)$ for all $i < |\alpha|$ 
and $\gamma(|\alpha| + i) = \beta(i)$ for all $i < |\beta|$. If $|\beta| = n$ and $\beta(i) = x$ for all $i < n$, we 
write simply $\alpha x^n$ or $\alpha^\frown x^n$. 

5. If $|\alpha| \geq 1$ then $\alpha^*$ denotes $\alpha \upharpoonright |\alpha| - 1$, i.e., $\alpha$ with its last element (as a sequence) 
removed. In particular, for every $x \in \omega$, $(\alpha x)^* = \alpha$. 

6. The set of all finite sequences in X is denoted by $X^{<\omega}$, the set of all finite 
sequences of length less than n by $X^{<n}$, and the set of all finite sequences of 
length exactly n by $X^n$. If $X = \{0, \ldots, k-1\}$ for some $k \in \omega$, we write $k^{<\omega}$, 
$k^{<n}$, and $k^n$ instead. 
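
As an informal illustration of Definition 1.5.1 (not part of the formal development), finite sequences can be modeled as Python tuples. The function names below are hypothetical and chosen only for this sketch; it assumes that tuples of natural numbers stand in for strings in X.

```python
from itertools import product

# Illustrative sketch: finite sequences modeled as Python tuples.
# All names here are hypothetical helpers, not notation from the text.

def is_initial_segment(a, b):
    """True if a is an initial segment of b (a agrees with b below len(a))."""
    return len(a) <= len(b) and all(a[i] == b[i] for i in range(len(a)))

def is_proper_initial_segment(a, b):
    """True if a is an initial segment of b and a differs from b."""
    return is_initial_segment(a, b) and a != b

def concat(a, b):
    """Concatenation of a followed by b."""
    return a + b

def drop_last(a):
    """The sequence a with its last element removed; requires len(a) >= 1."""
    if len(a) < 1:
        raise ValueError("sequence must have length at least 1")
    return a[:-1]

def strings_of_length(k, n):
    """All sequences of length exactly n with entries in {0, ..., k-1}."""
    return list(product(range(k), repeat=n))
```

For instance, `drop_last(concat(a, (x,)))` returns `a` for every tuple `a` and number `x`, mirroring the last sentence of item 5.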


We typically use lowercase Greek letters to denote elements of $X^{<\omega}$. In general, 
these are letters near the beginning of the alphabet ($\alpha, \beta, \gamma, \ldots$), but when $X = 
\{0, \ldots, k-1\}$ for some $k \in \omega$ we usually use letters near the middle and end of the 
alphabet ($\sigma, \tau, \rho, \ldots$). For elements of $X^n$, we sometimes also use the notation $\vec{x}$, $\vec{y}$, 
etc., and when n is fixed but unspecified we may abuse notation slightly and write, 
e.g., $\vec{x} \in X$ instead of $\vec{x} \in X^n$. Elements of $2^{<\omega}$ are called binary strings. 


Definition 1.5.2 (Infinite sequences). Fix $X \subseteq \omega$. 


1. An infinite sequence in X is a function $f \colon \omega \to X$. The set of all infinite 
sequences in X is denoted $X^\omega$. 

2. If $\alpha \in X^{<\omega}$ and $f \in X^\omega$ then $\alpha$ is an initial segment of f, and f is an extension 
of $\alpha$, written either as $\alpha \preceq f$ or $\alpha \prec f$, if $\alpha = f \upharpoonright |\alpha|$. Though we allow the 
notation $\alpha \preceq f$, of course we can never have $\alpha = f$ for $\alpha \in X^{<\omega}$ and $f \in X^\omega$. 


Definition 1.5.3 (Cantor space and Baire space). Fix $X \subseteq \omega$. 


1. For $\alpha \in X^{<\omega}$, the cylinder set of $\alpha$ is the set $[[\alpha]] = \{f \in X^\omega : \alpha \prec f\}$. 
2. The (clopen) topology on $X^\omega$ is generated by the cylinder sets $[[\alpha]]$ for $\alpha \in X^{<\omega}$. 


When $X = \{0, \ldots, k-1\}$ for some k (most often k = 2), the resulting space is called 
Cantor space. When $X = \omega$, the resulting space is called Baire space. A standard 
fact is that $X^\omega$ is compact if and only if X is finite. 


Definition 1.5.4 (Finitary functions). Fix $X \subseteq \omega$. 


1. A finitary function on a set X is a function $f \colon X^n \to X$ for some $n \in \omega$, in 
which case it is also called an n-ary function on X. 

2. Fix $k \in \omega$ and $X \subseteq \omega$. An n-ary function f on X is k-valued if $f(\vec{x}) < k$ for all 
$\vec{x} \in X^n$. We sometimes write $f \colon X^n \to k$ in place of $f \colon X^n \to \{0, \ldots, k-1\}$. 
Thus, when $X = \omega$ and n = 1, a function f is k-valued precisely if $f \in k^\omega$. 

3. A finitary relation on a set X is a subset R of $X^n$ for some $n \in \omega$, in which case 
it is also called an n-ary relation on X. 

4. The characteristic function of an n-ary relation R on X is the function $\chi_R \colon X^n \to 
2$ such that for all $\vec{x} \in X^n$, $\chi_R(\vec{x}) = 1$ if and only if $\vec{x} \in R$. 


A 1-ary relation on a set X is thus just a subset of X. A 0-ary function is technically 
a function $f \colon \{\langle\rangle\} \to X$, but we identify it with the unique element of X given by 
$f(\langle\rangle)$. Since $X \subseteq \omega$ and $\langle\rangle \notin \omega$, the only 0-ary relation is $\emptyset$. Notice that if $\{0, 1\} \subseteq X$ 
then the characteristic function of an n-ary relation on X is an n-ary function on X. 


Convention 1.5.5 (Identification of sets and characteristic functions). We identify 
all finitary relations on $\omega$ with their characteristic functions. In particular, we use the 
term “set” (or, more precisely, “subset of $\omega$”) interchangeably with “element of $2^\omega$”. 
So for example, if A is a set we can write either $x \in A$ or $A(x) = 1$, as convenient; 
given $f \in \omega^\omega$ we write $f(x) \neq A(x)$ to mean that if $f(x) = 0$ then $x \in A$ and if 
$f(x) \neq 0$ then $x \notin A$, etc. In this way, too, any definition or result formulated for 
finitary functions on $\omega$ automatically extends to finitary relations. 


We fix an effective pairing function, which is a bijection $\omega^2 \to \omega$ coding pairs 
of numbers by numbers. For a concrete example, see, e.g., Soare [295, p. xxxii]. 
For an example formalized in second order arithmetic, see Theorem 5.3.4 and 
Exercise 5.13.25 below. For our discussion outside of formal systems, the specific choice 
does not matter. We denote the code for the pair $(x, y)$ by $\langle x, y \rangle$. 


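Definition 1.5.6 (Effective joins). Fix $X_0, X_1, \ldots \subseteq \omega$. 

1. For each $n \in \omega$, the (finite) join of $X_0, \ldots, X_{n-1}$ is 

\[ \bigoplus_{i<n} X_i = X_0 \oplus \cdots \oplus X_{n-1} = \{\langle i, x \rangle : i < n \land x \in X_i\}. \]

2. The (infinite) join of $X_0, X_1, \ldots$ is $\bigoplus_{i \in \omega} X_i = \{\langle i, x \rangle : i < \omega \land x \in X_i\}$. 


The join is also called the Turing join. We may also use the notation $\langle X_0, \ldots, X_{n-1} \rangle$ 
in place of $X_0 \oplus \cdots \oplus X_{n-1}$, and $\langle X_i : i \in \omega \rangle$ in place of $\bigoplus_{i \in \omega} X_i$. 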


The join $X_0 \oplus X_1$ of two sets is sometimes defined as $\{2x : x \in X_0\} \cup \{2x + 1 : x \in X_1\}$. 
This has the advantage of being notationally lighter since it avoids the use of the 
pairing function. The definition in (1) above, by contrast, has the advantage of being 
consistent with the definition of the infinite join. But the choice will not matter for 
any part of our discussion below, as we can move between the two without affecting 
any of our arguments. So we will use them interchangeably, as convenient. 
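
The following sketch illustrates the two versions of the join. It assumes the Cantor pairing function as one possible choice of effective pairing function (the text deliberately leaves the choice unspecified), and it represents sets by their characteristic predicates, in the spirit of Convention 1.5.5. All function names are hypothetical.

```python
from math import isqrt

# Assumption: Cantor pairing is used as the pairing function. The text only
# requires some effective bijection between pairs and numbers.

def pair(x, y):
    """Code the pair (x, y) by a single natural number."""
    return (x + y) * (x + y + 1) // 2 + y

def unpair(z):
    """Inverse of pair: recover (x, y) from its code z."""
    w = (isqrt(8 * z + 1) - 1) // 2   # w = x + y
    y = z - w * (w + 1) // 2
    return w - y, y

def in_finite_join(z, sets):
    """Decide membership of z in the join of sets[0], ..., sets[n-1].

    Each entry of `sets` is a predicate deciding membership in one set."""
    i, x = unpair(z)
    return i < len(sets) and bool(sets[i](x))

def in_even_odd_join(z, x0, x1):
    """Decide membership of z in {2x : x in X0} union {2x+1 : x in X1}."""
    return bool(x0(z // 2)) if z % 2 == 0 else bool(x1(z // 2))
```

Either version lets one recover each component set from the join and compute the join from its components, which is why the choice between them does not affect the arguments.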


Definition 1.5.7 (Projections). For $X \subseteq \omega \times \omega$ and $i \in \omega$, $X^{[i]} = \{x \in \omega : \langle i, x \rangle \in 
X\}$. Thus, if $X = \bigoplus_{i<n} X_i$ or $X = \bigoplus_{i \in \omega} X_i$ then $X^{[i]} = X_i$ for all i. 


Definition 1.5.8 ($\lambda$ notation). If f is a function of multiple parameters $f(\vec{x}, \vec{y})$, the 
notation 

\[ \lambda \vec{x}. f(\vec{x}, \vec{y}) \]

refers to the function of tuples $\vec{x}$ induced by fixing particular values for $\vec{y}$. For 
example, $f(x) = \lambda x. x^y$ is a function which computes the yth power of its input, 
assuming y is fixed by context. 


We use $\equiv$ to denote literal equality of formulas in formal languages. For example, 
$\varphi \equiv X = Y$ means that $\varphi$ is the formula “$X = Y$”. 


Part I 
Computable mathematics 


Chapter 2 
Computability theory 


One of our primary tools for studying the difficulty of producing a solution to a 
problem will be computability. The theory of computability is a field in its own 
right, of which we will need only certain pieces. For a complete introduction to 
classical computability, we refer to Soare [295] or Downey and Hirschfeldt [74]. A 
reader who is familiar with the basics of computability theory, or who is willing to 
take them for granted temporarily, may wish to skip to Chapter 3. 


2.1 The informal idea of computability 


Classical computability theory begins with the question of which functions from $\omega$ 
to $\omega$ are “effectively calculable” or “algorithmically computable by a human”. Of 
course, these terms appear in quotes because they are informal, intuitive concepts, 
and as such the question can never be completely answered by mathematics. But the early part 
of the 20th century saw a flurry of attempts at giving formal models of computation 
to capture all those functions anyone could reasonably regard as “effectively calcu- 
lable” in the informal sense. This culminated, in 1936, with Turing’s now-famous 
model of Turing machines and, perhaps equally importantly, his highly compelling 
philosophical argument for why his model succeeds in the above regard (see Tur- 
ing [314]). Today, the premise that this is the case—that the functions that can be 
computed via Turing’s model are exactly the “effectively calculable” functions—is 
called the Church-Turing thesis. For an account of Turing’s argument, and more 
about the history of its development, see Soare [294]. 

Functions from $\omega$ to $\omega$ are a natural place to begin because there are many clearly 
reasonable ways to represent natural numbers (e.g. binary notation) that allow them 
to be manipulated concretely. This allows us to focus more directly on the algorithmic 
nature of the computation. In more general settings, the individual objects of study 
(e.g. real numbers or continuous functions from R to R) will need to be represented 
in a more concrete way to allow us to compute with them. 

There are a few key facets of algorithmic computation that bear mentioning. 

• The computation is deterministic: if the same input is provided multiple times, 
precisely the same computational process will be followed, and the same result (if 
any) will be produced. 

• Resource limitations are ignored. A successful computation may take an arbitrarily 
long (though finite) time to perform, and may require an arbitrary amount of 
temporary storage space (scratch paper, in the case of a human). 

• The computation must be algorithmic: no creativity or ingenuity is required of 
the person performing the computation. In principle, the entire procedure can 
be written in a finite amount of space and conveyed to another person. 

• The algorithm must include a way for the person performing the computation to 
determine when it is finished, and determine the specific value of the function 
that has been obtained. 


Turing proposed a particular type of mechanical computer, the Turing machine, 
that models computations with these properties. He gave a compelling argument that 
any function computable by a human in an algorithmic fashion is computable by a 
Turing machine. 

For the purposes of formalized arithmetic, it will be more convenient for us to 
consider a different formalization of computability. We will begin with a class of 
functions, the primitive recursive functions, which is a proper subset of the class 
of computable functions, but which has the advantage that each primitive recursive 
function is computable using particularly restricted means. In particular, it will be 
clear that the computation of a primitive recursive function on any input value will 
produce a result. In this sense, all primitive recursive functions are total. 

We will then define the partial computable functions as the closure of the primitive 
recursive functions under operations including an unbounded search operator, the 
$\mu$ operator. This definition is particularly useful for our overall goal of relating 
computability with formal theories of arithmetic, because the $\mu$ operator is closely 
tied to numerical quantification. This comes at the cost of having to consider partial 
functions, but these are important and useful in their own right. In any case, we can 
always consider just the total functions when desired. 

It will be clear from the definition that each such total function is “effectively cal- 
culable”: computable, in principle, by a human with pencil and paper. The converse 
is of course more difficult, as it is not a mathematical argument but a philosophical 
one. In this regard, the definition using the $\mu$ operator is not very persuasive, in 
spite of its technical advantages. Instead, the standard argument here is to show that 
the total computable functions are co-extensive with the total functions that can be 
computed by Turing machines, and then appeal to Turing’s original argument about 
the latter class. The more skeptical reader may, perhaps, be better convinced by an 
empirical fact: in the 90 years since Turing gave this argument, no counterexamples 
to the Church-Turing thesis have emerged. 


2.2 Primitive recursive functions 


The class of primitive recursive functions is the smallest collection of functions 
containing certain basic functions and closed under certain operations. We begin 
with the former. 


Definition 2.2.1 (Basic functions). The basic functions are a particular class of 
finitary functions on $\omega$. They are: 


1. For each $n > 0$ and $i < n$, a projection function $P^n_i$ defined by 

\[ P^n_i(x_0, \ldots, x_{n-1}) = x_i \]

for all $\langle x_0, \ldots, x_{n-1} \rangle \in \omega^n$. 

2. The successor function defined by $S(x) = x + 1$ for all $x \in \omega$. 

3. The constant unary zero function $Z(x) = 0$ and the constant 0-ary function 
$Z_0 = 0$. 


The two basic operations for primitive recursive functions are generalized com- 
position and primitive recursion. 


Definition 2.2.2 (Generalized composition). Fix $n, m \in \omega$. If f is an n-ary function 
on $\omega$ and, for each $i < n$, $g_i$ is an m-ary function on $\omega$, the generalized composition 
function $C(f, g_0, \ldots, g_{n-1})$ is the m-ary function satisfying 

\[ C(f, g_0, \ldots, g_{n-1})(x_0, \ldots, x_{m-1}) = f(g_0(x_0, \ldots, x_{m-1}), \ldots, g_{n-1}(x_0, \ldots, x_{m-1})) \]

for all $\langle x_0, \ldots, x_{m-1} \rangle \in \omega^m$. A class of functions F is closed under generalized 
composition if it is closed under the operator C. 


Definition 2.2.3 (Primitive recursion). Fix $n \in \omega$ along with $f \colon \omega^n \to \omega$ and 
$g \colon \omega^{n+1} \to \omega$. Then $R_n(f, g)$ is the unique function satisfying 

\[
\begin{aligned}
R_n(f, g)(0, x_0, \ldots, x_{n-1}) &= f(x_0, \ldots, x_{n-1}), \\
R_n(f, g)(y + 1, x_0, \ldots, x_{n-1}) &= g(R_n(f, g)(y, x_0, \ldots, x_{n-1}), x_0, \ldots, x_{n-1}),
\end{aligned}
\]

for all $\langle x_0, \ldots, x_{n-1} \rangle \in \omega^n$ and $y \in \omega$. A class of functions F is closed under 
primitive recursion if it is closed under the operator R. This operator is a special 
case of the general concept of a recursor from proof theory. 


It is immediate that the function $R_n(f, g)$ is well-defined and unique for all 
appropriate f and g. Moreover, there is a natural algorithm for computing it. Given 
inputs $y, x_0, \ldots, x_{n-1}$, we first compute $s_0 = f(x_0, \ldots, x_{n-1})$, then we compute 
$s_1 = g(s_0, x_0, \ldots, x_{n-1})$, then $s_2 = g(s_1, x_0, \ldots, x_{n-1})$, and we continue this way 
until we have computed $s_y$, which is the value of $R_n(f, g)(y, x_0, \ldots, x_{n-1})$. 
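
The algorithm just described can be sketched in Python as follows. This is only an illustration under the assumption that finitary functions on the natural numbers are modeled as Python functions of several integer arguments; the names `P`, `C`, `R`, `add`, and `mul` are choices made for this sketch.

```python
# Illustrative sketch of the basic functions and of the operators C and R.

def Z(x):
    """Constant unary zero function."""
    return 0

def S(x):
    """Successor function."""
    return x + 1

def P(n, i):
    """Projection: return the i-th of n arguments (0 <= i < n)."""
    def proj(*xs):
        assert len(xs) == n
        return xs[i]
    return proj

def C(f, *gs):
    """Generalized composition: apply f to the values g_0(xs), ..., g_{n-1}(xs)."""
    def h(*xs):
        return f(*(g(*xs) for g in gs))
    return h

def R(f, g):
    """Primitive recursion: compute s_0 = f(xs), then s_{j+1} = g(s_j, xs),
    and return s_y on input (y, xs)."""
    def h(y, *xs):
        s = f(*xs)
        for _ in range(y):
            s = g(s, *xs)
        return s
    return h

# add(0, x) = x and add(y + 1, x) = S(add(y, x)).
add = R(P(1, 0), C(S, P(2, 0)))
# mul(0, x) = 0 and mul(y + 1, x) = add(mul(y, x), x).
mul = R(Z, add)
```

The loop in `R` runs exactly y times, which reflects why every function built this way is total.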


Definition 2.2.4. The class of primitive recursive functions is the smallest class 
of finitary functions on $\omega$ that contains the basic functions and is closed under 
generalized composition and under primitive recursion. 


We are often interested in the ability to “decide” a set, that is, to calculate the 
characteristic function of the set. What we are doing, then, is finding an algorithm to 
correctly determine which elements are members of the set. Of course, using classical 
logic, each $n \in \omega$ is, or is not, an element of a set A. The question of interest is: 
how hard is it to determine which option holds? For our purposes, the easiest sets to 
decide have primitive recursive characteristic functions. This method of converting 
questions about sets into corresponding questions about functions appears very often 
in computability. 

A key insight, which arrived early in the study of computability theory, is that 
little changes if we allow extra functions as basic functions. This concept, known 
as relativization, now appears throughout computability theory. Rather than only 
studying functions that are computable, the field is interested in which functions are 
computable “relative to” other functions which may not themselves be computable. 
(See the discussion in Section 2.7.) For this reason, it is often said, only partially in 
jest, that computability theory is primarily about noncomputable functions. 


Definition 2.2.5. Let g be a finitary function on $\omega$. The class of functions primitive 
recursive relative to g (or primitive recursive in g) is the smallest class of functions 
that contains g and all the basic functions and is closed under generalized composition 
and primitive recursion. 


When we take an arbitrary function $g \colon \omega \to \omega$ and use it as if it were a basic 
function, we describe g as an “oracle”. The idea is that we may need to occasionally 
“consult” g for information that we cannot obtain by primitive recursive means 
alone. And indeed, if g is not itself primitive recursive, then the overall function we 
compute might not be primitive recursive. 

We often want to view oracle functions like g above as higher-level parameters 
to the function we are computing. Suppose $f(\vec{n})$ is a primitive recursive function 
relative to a function g. Because the class of functions primitive recursive in g 
is defined with a closure condition, f can be associated with a particular finite 
construction tree. Each node of the tree is labeled with a basic function, one of the 
construction rules C or R, or with g. We can thus view g as a kind of parameter to 
the tree: for every h the same tree would define a function primitive recursive in h, 
if we simply replaced g with h. 

We can thus view f as a function $f(\vec{n}, g)$ of both $\vec{n}$ and g. In particular, if $\vec{n}$ 
is a single variable n, we have an induced map $F(g) = \lambda n. f(n, g)$ from functions 
to functions. We call F a primitive recursive functional. In general, we will use the 
notation $h(\vec{n}, g)$ to refer to a primitive recursive function defined by some (unnamed) 
construction tree in which some leaves are labeled with the oracle function g. 
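
To make the idea of an oracle as a parameter concrete, here is a small Python sketch. It assumes a hypothetical construction tree consisting of a single application of primitive recursion to the 0-ary zero function and the oracle g; the names `R`, `Z0`, and `F` are illustrative only.

```python
# Illustrative sketch: one fixed construction tree, R(Z0, g), viewed as a
# functional F taking the oracle g as a parameter.

def Z0():
    """Constant 0-ary zero function."""
    return 0

def R(f, g):
    """Primitive recursion: s_0 = f(xs), s_{j+1} = g(s_j, xs); return s_y."""
    def h(y, *xs):
        s = f(*xs)
        for _ in range(y):
            s = g(s, *xs)
        return s
    return h

def F(g):
    """Map the oracle g to the function n -> g(g(...g(0)...)) (n applications)."""
    return R(Z0, g)
```

Replacing the oracle changes the resulting function but not the tree: `F(lambda x: x + 1)` is the identity function, while `F(lambda x: x + 2)` doubles its input. Nothing in `F` requires g itself to be primitive recursive.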


2.2.1 Some primitive recursive functions 


In this section, we show that certain functions are primitive recursive. These functions 
will be required for the deeper results in the following sections. 

The direct way to show that a function f is primitive recursive is to write a definition 
of f in terms of the basic functions and the operators C and R. Such definitions are 
difficult to read, however. Instead, we typically write the definition in ordinary 
mathematical notation, via a set of recursion equations, relying on the reader to 
translate the informal definition into a formal one. 

To provide an example of this method, we now show that the identity function 
$I(x) = x$ is primitive recursive. We typically do this by giving a set of recursion 
equations. For $I$, the recursion equations are 

\[
\begin{aligned}
I(0) &= 0, \\
I(x + 1) &= I(x) + 1.
\end{aligned}
\]

To translate these equations into a formal definition, we first identify the functions 
on the right hand sides of the equations as $Z_0$ and $S(I(x))$. Thus 

\[ I(x) = R(Z_0, S)(x). \]

Of course, we also have $I(x) = P^1_0(x)$. In general, there can be many ways to construct 
a primitive recursive function. 

The trick in the next lemma is to use the two cases of the primitive recursion 
operator as a substitute for the “if” statement of a programming language. 


Lemma 2.2.6. The modified subtraction function $\dot{-}$ defined for all $x, y \in \omega$ by 

\[
x \mathbin{\dot{-}} y =
\begin{cases}
0 & \text{if } y > x, \\
x - y & \text{otherwise,}
\end{cases}
\]

is primitive recursive. 


Proof. We first define an auxiliary function $P(k)$ with the recursion equations 

\[
\begin{aligned}
P(0) &= 0, \\
P(k + 1) &= k.
\end{aligned}
\]

Thus $P(k)$ is $k - 1$ if $k > 0$ and 0 if $k = 0$. Now we define $f(m, n) = m \mathbin{\dot{-}} n$ as 

\[
\begin{aligned}
f(m, 0) &= m, \\
f(m, k + 1) &= P(f(m, k)).
\end{aligned}
\]

Formally, the m in the first case should be $I(m)$, where $I$ is the identity function 
defined above. We will not comment again on trivialities such as this. □ 


Lemma 2.2.7. The function $E(n, m)$ defined so that 

\[
E(n, m) =
\begin{cases}
0 & \text{if } m = n, \\
1 & \text{otherwise,}
\end{cases}
\]

is primitive recursive. 


Proof. We first define an auxiliary function $E_0(n)$ by primitive recursion, as follows: 

\[
\begin{aligned}
E_0(0) &= 0, \\
E_0(k + 1) &= 1.
\end{aligned}
\]

Then $E(n, m)$ is defined as 

\[ E(n, m) = E_0((m \mathbin{\dot{-}} n) + (n \mathbin{\dot{-}} m)). \]

Now $E_0(k) = 0$ if and only if $k = 0$, and $(n \mathbin{\dot{-}} m) + (m \mathbin{\dot{-}} n)$ equals 0 if and only if 
$m = n$. So the definition is correct. □ 
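
The recursion equations in the two preceding proofs can be transcribed almost directly into Python. The sketch below is only an illustration: the names `pred`, `monus`, `E0`, and `E` are chosen here, and each recursion on the second argument is unrolled into a loop.

```python
# Illustrative transcription of the recursion equations for P, modified
# subtraction, E_0, and E.

def pred(k):
    """P(0) = 0 and P(k + 1) = k."""
    return 0 if k == 0 else k - 1

def monus(m, n):
    """f(m, 0) = m and f(m, k + 1) = P(f(m, k)), so f(m, n) is m monus n."""
    value = m
    for _ in range(n):
        value = pred(value)
    return value

def E0(k):
    """E_0(0) = 0 and E_0(k + 1) = 1."""
    return 0 if k == 0 else 1

def E(n, m):
    """0 if m == n and 1 otherwise, using only monus, addition, and E0."""
    return E0(monus(m, n) + monus(n, m))
```

Note that at most one of the two summands in `E` is nonzero, so their sum vanishes exactly when the inputs agree.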


2.2.2 Bounded quantification 


Per Convention 1.5.5, the definition of being primitive recursive extends to sets and 
other finitary relations by identifying these with their characteristic functions. Thus, 
for example, the relation x = y is primitive recursive, since its characteristic function 
is the function $1 \mathbin{\dot{-}} E(x, y)$ as defined in the previous section. Finitary relations on 
$\omega$ are closely related to syntax, because the standard connectives from propositional 
logic all have direct set theoretic interpretations. Indeed, if R and S are both n-ary 
relations on $\omega$, we can define $R \lor S = R \cup S$, $R \land S = R \cap S$, and $\neg R = \omega^n \setminus R$. The 
following is an easy but important observation. 


Proposition 2.2.8. Fix $n \in \omega$ and let R and S be n-ary relations on $\omega$. If R and S 
are both primitive recursive then so are $R \lor S$, $R \land S$, and $\neg R$. Thus the class of 
primitive recursive relations is closed under logical connectives. 


Proof. We have 

\[
\begin{aligned}
R \lor S &= \min\{1, R + S\}, \\
R \land S &= R \cdot S, \\
\neg R &= 1 \mathbin{\dot{-}} R,
\end{aligned}
\]

where on the right hand side we are treating R and S as functions. (Note that min, +, 
and $\cdot$ are all primitive recursive functions.) □ 
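
As a quick sanity check of these identities, the following sketch treats relations as 0/1-valued Python functions, following Convention 1.5.5. The names `OR`, `AND`, and `NOT` are hypothetical, and ordinary subtraction is used for negation because the values involved are always 0 or 1.

```python
# Illustrative sketch: logical connectives computed arithmetically on
# characteristic functions (functions returning 0 or 1).

def OR(R, S):
    """Characteristic function of the union: min{1, R + S}."""
    return lambda *xs: min(1, R(*xs) + S(*xs))

def AND(R, S):
    """Characteristic function of the intersection: R * S."""
    return lambda *xs: R(*xs) * S(*xs)

def NOT(R):
    """Characteristic function of the complement: 1 - R (R is 0/1-valued)."""
    return lambda *xs: 1 - R(*xs)

# Two sample relations on the natural numbers.
is_even = lambda x: 1 if x % 2 == 0 else 0
below_three = lambda x: 1 if x < 3 else 0
```

For example, `AND(is_even, NOT(below_three))` is the characteristic function of the even numbers that are at least 3.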


If $R(\vec{x})$ is a relation and $m \in \omega$ is fixed, we may ask whether every $\vec{x} < m$ satisfies 
R, or ask whether there is at least one $\vec{x} < m$ satisfying $R(\vec{x})$. (Recall that $\vec{x} < m$ 
means that every element of the tuple $\vec{x}$ is bounded by m.) These bounded quantifiers 
will play an important role in the sequel because, if R is effectively calculable and 
m is known, then the relations 

\[
\begin{aligned}
(\exists \vec{x} < m) R(\vec{x}) &= \{m : (\exists \vec{x})[\vec{x} < m \land R(\vec{x})]\}, \\
(\forall \vec{x} < m) R(\vec{x}) &= \{m : (\forall \vec{x})[\vec{x} < m \to R(\vec{x})]\}
\end{aligned}
\]

are also calculable. Intuitively, this is because deciding the formula for a particular m 
only requires testing the finite number of tuples less than m. This stands in contrast 
to the usual “unbounded” quantifiers: even if we have an algorithm to determine 
whether $R(m, x)$ holds for each m and x, there may be no algorithm for the relation 
$E(m) \equiv (\exists x) R(m, x)$ or the relation $A(m) \equiv (\forall x) R(m, x)$. We will see many 
concrete examples of this in Section 2.5. 


Theorem 2.2.9. If $R(\vec{x})$ is a primitive recursive relation, then so are $(\forall \vec{x} < m) R(\vec{x})$ 
and $(\exists \vec{x} < m) R(\vec{x})$, as functions of m and any other inputs of R. 


Proof. First, by duality, we have 

\[ (\exists \vec{x} < m) R(\vec{x}) \iff \neg (\forall \vec{x} < m)[\neg R(\vec{x})]. \]

Thus, because the class of primitive recursive relations is closed under negation, it 
is enough to consider only the bounded universal quantifier. By definition, we have 

\[ (\forall \vec{x} < m) R(\vec{x}) = \prod_{\vec{x} < m} R(\vec{x}). \]

Therefore it is enough to show that if $f \colon \omega \to \omega$ is primitive recursive then so 
is $g(m) = \prod_{k < m} f(k)$. This is straightforward. First define a function h by the 
recursion equations 

\[
\begin{aligned}
h(0) &= f(0), \\
h(k + 1) &= f(k + 1) \cdot h(k),
\end{aligned}
\]

and then define 

\[
\begin{aligned}
g(0) &= 1, \\
g(k + 1) &= h(k).
\end{aligned}
\]

Remark 2.2.10. Although the use of recurrence relations in the previous proof is 
already a simplification from using the generalized composition and primitive recur- 
sion operators directly, the proofs may still appear formalistic. As in other areas of 
logic, some formalism is unavoidable, especially when setting up the basic theorems. 
We will soon have enough tools, however, to avoid needing to write complex recurrence 
relations. There was a period in the mid 20th century when papers relied on 
derivations using recurrence relations (for example, Kleene [181]), but contemporary 
work in computability theory tends to avoid them unless needed. 


The closure of primitive recursive relations under bounded quantification makes 
it much easier to prove that relations of interest are primitive recursive. The following 
result, which will be important in the next section, demonstrates how this method is 
used. 


Theorem 2.2.11. The relation “m is a prime number” is primitive recursive. 


Proof. First, note that the relation $A(p,q,m) \equiv (p \cdot q = m)$ is primitive recursive.
Thus we may define a primitive recursive relation


\[ R(m) \equiv m \geq 2 \wedge \neg(\exists p < m)(\exists q < m)[p \cdot q = m]. \]


It is immediate that R(m) represents the relation “m is prime”, because a number
$m \geq 2$ is the product of two strictly smaller factors if and only if it is composite. □
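To see the bounded quantifiers at work, here is a minimal Python sketch of the relation “m is prime” that searches only over p, q < m. The function name `is_prime` is our own choice, and the code is meant as an illustration of the definition rather than an efficient primality test.

```python
def is_prime(m):
    """Decide primality using only searches bounded by m."""
    if m < 2:
        return False
    # Bounded existential quantifiers: (exists p < m)(exists q < m)[p * q = m].
    has_factorization = any(p * q == m for p in range(m) for q in range(m))
    return not has_factorization
```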


2.2.3 Coding sequences with primitive recursion


Coding is a key technique in reverse mathematics: a mathematical object of one 
kind is represented as an object of a second kind, so that manipulating objects of 
the second kind indirectly manipulates objects of the first kind. The most basic kind 
of coding happens outside our formalization: we treat natural numbers as atomic 
entities with no internal structure, but a human computing with natural numbers will 
code them using binary or decimal representation (or some other method) to write 
them down. 

The next most basic kind of coding allows us to represent a tuple of natural 
numbers using a single natural number. To achieve this, we will need a coding 
function that produces a number (“code”) for the tuple; a length function that tells 
the length of the original tuple given its code; and a decoding function that gives 
back the elements of the tuple from the code. The definition below is designed in 
a way that each function has codomain $\omega$, matching the way the class of primitive
recursive functions was defined. 


Definition 2.2.12. A coding system for finite sequences consists of the following. 


• (Coding): A sequence $\langle \pi_n : n > 0 \rangle$ of injective functions $\pi_n\colon \omega^n \to \omega \setminus \{0\}$
such that every $s \in \omega$ is in the range of at most one $\pi_n$.

• (Length): A function lh(s) which, given $s \in \omega$, returns the unique n such that
s is in the range of $\pi_n$, or 0 if s is not in the range of any $\pi_n$.

• (Decoding): A function $\pi(s,i)$ such that if $i < \mathrm{lh}(s)$ and
$\pi_k(x_0, \ldots, x_{k-1}) = s$, then $\pi(s,i) = x_i$.


We will be interested in coding systems for sequences in two settings: the primitive 
recursive functions and arithmetic. The role of these coding systems is, in a sense, 
to allow for a function or formula with a fixed number of parameters to act as if it
has a variable number of parameters, by taking as input a code for a sequence and 
then pretending that the values of that sequence were all passed as parameters. In 
this way, we can limit our considerations to functions that take at most one input. 


Theorem 2.2.13. There is a coding system for sequences in which all the functions
$\pi_n$, the function lh, and the function $\pi$ are primitive recursive.


One option is to use our pairing function (defined in Chapter 1) to code pairs of 
numbers, and then define codes for longer sequences recursively. (See also Theo- 
rem 5.3.4.) Another option is to use functions of the form 


\[ \pi_n(x_0, \ldots, x_{n-1}) = \prod_{i=1}^{n} p_i^{x_{i-1}+1}, \]

where $p_i$ is the ith prime. In this case, the function lh(s) will determine the number of
distinct prime factors of a number s, and the function $\pi(s, i)$ will return the exponent
of the ith prime in the prime factorization of s. The proof that these functions are 
primitive recursive is Exercise 2.9.2. Although this provides a simple coding scheme, 
the functions here require more work in the context of formal arithmetic, where it 
is not as straightforward to work with an enumeration of the prime numbers. In that 
context, we use a coding system based on Gödel’s $\beta$ function, as we will see in
Definition 5.5.5. 
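The prime-power option can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `nth_prime`, `encode`, `lh`, and `decode` are ours, and we assume each entry x is stored as the exponent x + 1, so that entries equal to 0 remain visible in the code and sequences of different lengths receive different codes.

```python
def nth_prime(i):
    """Return the i-th prime, counting from nth_prime(1) = 2."""
    found, candidate = 0, 1
    while found < i:
        candidate += 1
        if all(candidate % d for d in range(2, candidate)):
            found += 1
    return candidate


def encode(xs):
    """Code a nonempty tuple as a product of prime powers (exponent x + 1)."""
    s = 1
    for i, x in enumerate(xs, start=1):
        s *= nth_prime(i) ** (x + 1)
    return s


def lh(s):
    """Length of the coded tuple, or 0 if s is not a code."""
    if s < 2:
        return 0
    n = 0
    while s % nth_prime(n + 1) == 0:
        p = nth_prime(n + 1)
        while s % p == 0:            # strip every factor of the current prime
            s //= p
        n += 1
    return n if s == 1 else 0        # leftover factors mean s is not a code


def decode(s, i):
    """Entry i (counting from 0) of the tuple coded by s, for i < lh(s)."""
    p, exponent = nth_prime(i + 1), 0
    while s % p == 0:
        s //= p
        exponent += 1
    return exponent - 1
```

For instance, the tuple (3, 0, 2) receives the code 2^4 · 3^1 · 5^3 = 6000 under this assumption.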

With Theorem 2.2.13 in hand, we can talk about finite sequences informally, as 
in Definition 1.5.1. In actuality, we will always implicitly be referring to codes. For 
example, if we consider the function $g\colon 2^{<\omega} \to \omega$ that assigns each finite binary
string its length, then formally g is defined on the set of codes, i.e., the set of $s \in \omega$
with $\mathrm{lh}(s) \neq 0$.


2.3 Turing computability 


As we continue our study of computability, it will be necessary to consider partial
functions on $\omega$. These are functions defined on a subset of $\omega$ but possibly not on all
of $\omega$. The fundamental reason we need these functions is that an algorithm may only
work correctly for certain input values. For example, considering natural numbers, 
we can only find square roots for numbers that are perfect squares. Thinking of the 
algorithm as giving a function, we will call the set of legitimate inputs the domain. In 
the case of the square root function, we can effectively determine if a natural number 
has a natural number square root. However, we will soon see algorithms for which 
there is no way to predetermine effectively whether a given input is in the domain or 
not. If we want to think of these algorithms as giving functions of some kind, they 
must give partial functions. 

There is standard notation in computability to work with partial functions. We 
write $f(x) \downarrow$ (which is read as “f(x) converges”) to indicate that f is defined on
input x; we write $f(x) \downarrow = y$ to indicate that f(x) is defined and equals y. We write
$f(x) \uparrow$ (read “f(x) diverges”) to indicate that f is not defined on input x. We write
$f \simeq g$ for functions of the same arity n to indicate that for all $x \in \omega^n$, if f(x) and
g(x) both converge then they take the same value. This is different from writing
f = g, by which we mean that for all $x \in \omega^n$, either f(x) and g(x) both diverge,
or they both converge and take the same value.

In computer programming, the “while” loop construct makes loops that may run 
forever. The $\mu$ operator, which we define now, is most naturally programmed using
this kind of loop. The $\mu$ operator says, informally, to perform a certain calculation
for x = 0, x = 1, x = 2, ..., and halt once we find a value where the calculation 
succeeds. 


Definition 2.3.1. Let f be a partial (n + 1)-ary function on $\omega$. The function M(f) is a
partial n-ary function defined as follows: for $x_0, \ldots, x_{n-1} \in \omega$, $M(f)(x_0, \ldots, x_{n-1})$
is the smallest y such that $f(y, x_0, \ldots, x_{n-1}) \downarrow = 0$ and $f(y^*, x_0, \ldots, x_{n-1}) \downarrow \neq 0$ for
all $y^* < y$, if such a y exists, and $M(f)(x_0, \ldots, x_{n-1}) \uparrow$ if no such y exists. We often
write M(f) using the $\mu$ operator, so that


\[ M(f)(x_0, \ldots, x_{n-1}) = (\mu y)[f(y, x_0, \ldots, x_{n-1}) \downarrow = 0]. \]
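The unbounded search can be sketched with a “while” loop in Python. This is an informal illustration: the name `mu` is ours, and a divergent computation is modeled simply by a call that never returns, so the sketch should only be run on inputs where the search succeeds.

```python
def mu(f):
    """Return the function (x_0, ..., x_{n-1}) -> least y with f(y, x_0, ..., x_{n-1}) = 0."""
    def search(*xs):
        y = 0
        while f(y, *xs) != 0:        # loops forever if no such y exists
            y += 1
        return y
    return search


# Partial square root: defined exactly on the perfect squares.
sqrt_exact = mu(lambda y, x: 0 if y * y == x else 1)

# Integer division of a by b (for b > 0): least y with (y + 1) * b > a.
floor_div = mu(lambda y, a, b: 0 if (y + 1) * b > a else 1)
```

Here `sqrt_exact(49)` returns 7, while a call such as `sqrt_exact(50)` would never halt, in line with the square root example from the start of this section.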


Definition 2.3.2. The class of partial computable functions is the smallest class of
finitary functions on $\omega$ which includes all primitive recursive functions and which
is closed under generalized composition, primitive recursion, and applying the M
operator. A function is computable if it is partial computable and total.


To avoid a common confusion with the terminology, we emphasize that the term “partial”
in the previous definition refers to “partial function”, not “partially computable”!
For a set, it only makes sense to be (total) computable, not partial computable, be- 
cause characteristic functions are always total. 

Let us now turn to relative computability, i.e., computability of one set from 
another. To get a picture of what this refers to, consider an arbitrary $A \subseteq \omega$.
Although A may not be computable itself, there is a natural sense in which “if A were
computable, then $\omega \setminus A$ would also be computable”. We again describe this in terms
of oracles. Imagine we have an oracle that can tell us whether each given number
is in A; essentially, we assume the oracle can somehow calculate the characteristic
function of A. To tell whether a number n is in $\omega \setminus A$, we first ask the oracle whether
the number is in A, and then we return the opposite answer. In a similar fashion,
many other sets can be computed “from A”, e.g., $\{2n : n \in A\}$, the set of prime
numbers in A, etc. 
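The three examples just mentioned can be sketched in Python by treating the oracle as a function that answers membership questions about A. The names `complement`, `doubles`, and `primes_in` are our own, and the oracle is an ordinary Python callable here purely for illustration; nothing in the procedures depends on how it arrives at its answers.

```python
def complement(oracle):
    """Membership test for the complement of A: ask the oracle, then flip the answer."""
    return lambda n: not oracle(n)


def doubles(oracle):
    """Membership test for {2n : n in A}."""
    return lambda m: m % 2 == 0 and oracle(m // 2)


def primes_in(oracle):
    """Membership test for the set of prime numbers in A."""
    def is_prime(m):
        return m >= 2 and all(m % d for d in range(2, m))
    return lambda n: is_prime(n) and oracle(n)
```

Each procedure makes finitely many oracle queries per input, and the same code works uniformly for every set A.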

Of course, we might use a much more complicated procedure to compute one set 
or function from another. Our definition of computable functions can be extended to 
make the notion of relative computability precise. 


Definition 2.3.3. Suppose that g is a finitary function on $\omega$. The class of functions
partial computable from g (the class of partial g-computable functions) is the smallest
class which includes every function that is primitive recursive in g, and is closed
under generalized composition, primitive recursion, and applying the $\mu$ operator.
A function f is computable from g (or g-computable, or Turing reducible to g),
written $f \leq_T g$, if it is partial computable from g and total.


As with primitive recursion, if g is not itself computable then there will be 
functions computable from g that are not computable in the ordinary sense, including, 
of course, g itself. The next theorem follows immediately from our definitions. 


Theorem 2.3.4. The relation $\leq_T$ satisfies the following.


1. (Reflexivity): For every g, we have $g \leq_T g$.
2. (Transitivity): For all f, g, h, if $f \leq_T g$ and $g \leq_T h$, then $f \leq_T h$.


Remark 2.3.5 (Other computable objects). We overload the term “computable”, or 
more generally, “g-computable”, by using it for anything that can be coded by a 
finitary function on $\omega$. For example, we say a sequence of numbers $\langle x_i : i \in \omega \rangle$ is
computable if there is a computable function $f\colon \omega \to \omega$ with $f(i) = x_i$ for all i. We
say a sequence of sets $\langle X_i : i \in \omega \rangle$ is computable if there is a computable set $X \subseteq \omega$
such that $X^{[i]} = X_i$ for all i. (Recall that $X^{[i]} = \{x : \langle i, x \rangle \in X\}$, where $\langle i, x \rangle$ denotes
a coded pair as in Section 2.2.3. So, $\langle X_i : i \in \omega \rangle$ is computable if $\bigoplus_{i \in \omega} X_i$ is
computable.) We will see more complicated ways of coding mathematical structures
by sets in Section 3.4. 


In general, a reducibility notion on a particular class of problems is a reflexive, 
transitive relation between problems. Relations of this kind are sometimes called 
preorders (also quasiorders). The following proposition explains the way that so- 
called degree structures are formed from preorders. We will see this phenomenon 
many times as we study additional reducibility notions. The result is so well known 
in the field that the corresponding conclusions for a new reducibility notion are often 
taken for granted with no explicit statement. 


Proposition 2.3.6 (Degree structures). Let $\leq$ be a preorder on a set A.


1. Define $a, b \in A$ to be $\leq$-equivalent if $a \leq b$ and $b \leq a$. Then $\leq$-equivalence is
an equivalence relation on A. We will write [a] for the equivalence class of a.

2. Let $A^*$ be the set of equivalence classes of $\leq$-equivalence. Then $\leq$ induces a
relation $\leq^*$ on $A^*$ with the rule that $[a] \leq^* [b]$ if $a \leq b$. The relation $\leq^*$ is a
partial order on $A^*$, that is, a reflexive, antisymmetric, and transitive relation.
We almost always use the same symbol $\leq$ to refer to $\leq^*$ as well.


We call the equivalence classes of A under $\leq$ the $\leq$-degrees.
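For a finite toy example of this construction, the following Python sketch forms the degrees of a preorder given as a two-argument predicate. The names `degrees` and `induced_leq` are ours, and the example preorder (divisibility on a few nonzero integers, where a and −a are equivalent) is chosen only for illustration.

```python
def degrees(elements, leq):
    """Group elements into classes of the equivalence 'a <= b and b <= a'."""
    classes = []
    for a in elements:
        for cls in classes:
            b = cls[0]               # any representative works, by transitivity
            if leq(a, b) and leq(b, a):
                cls.append(a)
                break
        else:
            classes.append([a])      # a starts a new degree
    return classes


def induced_leq(cls_a, cls_b, leq):
    """The induced order on degrees, computed on representatives."""
    return leq(cls_a[0], cls_b[0])


divides = lambda a, b: b % a == 0    # a preorder on the nonzero integers
```

With elements 1, −1, 2, −2, 3, 4, −4, the degrees are {1, −1}, {2, −2}, {3}, and {4, −4}, and the induced relation on them is antisymmetric even though divisibility itself is not.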


Applying the proposition to the relation $\leq_T$, with $A = \mathcal{P}(\omega)$, gives the definition
of the Turing degrees.


• Two functions (or sets) are Turing equivalent if each is computable from the
other. We denote Turing equivalence by $\equiv_T$.

• The Turing degree of a function g is the collection deg(g) (also denoted $\deg_T(g)$)
of all functions Turing equivalent to g.

• The Turing degrees are themselves partially ordered by $\leq_T$.

The structure of the partial order of Turing degrees is one of the central questions
in “classical” computability theory. It has been the focus of enormous amounts of 
research (see Simpson [284] for a partial survey). 

At the same time, the Turing degrees give us a precise way to measure the com- 
putational difficulty of mathematical problems. If one particular problem P entails 
computing a function f, and another problem Q entails producing a function g, with 
$f \leq_T g$, then the second problem is computationally stronger in a formally defined
sense. Most mathematical problems of interest do not simply ask us to produce a
single function, however. In the next chapter, we will discuss instance-solution
problems, which are more complex. Nonetheless, the methods of computability theory
will still be vital to studying them. 


2.4 Three key theorems 


In this section, we will state three key theorems about the class of partial computable 
functions. We will provide brief sketches of the key ideas of each proof. Complete 
details are given by Soare [295, 293] and many other texts on computability. 

The first of our three theorems shows that there is an effective indexing of the 
partial computable functions: i.e., a way to assign a natural number index e to each 
partial computable function f. This number e is essentially a program for computing 
f, encoded as a natural number. The primitive recursive functions T and U from
the theorem carry out the computation, given e and the inputs and oracles of the 
function. The main complication in the proof is that we are working with numbers, 
rather than with binary strings that electronic computers typically manipulate. Using 
numbers will facilitate formalizing computability into arithmetic later. 


Theorem 2.4.1 (Kleene’s normal form theorem). Let k > 0 be fixed. There are
primitive recursive functions $T = T^k$ and $U = U^k$ such that, for every partial
computable k-ary function f with oracle function g there is an e such that for all $\vec{n}$
of length k:

\[ f(\vec{n}) = U(\mu s\, [T(s, e, \vec{n}, g \upharpoonright s) = 0]). \]


Proof (sketch). The proof is based on the fact that computation is carried out in a 
step-by-step manner. Essentially, $T(s, e, \vec{n}, \sigma)$ returns 0 if s is a code for a sequence of
computational states for program e with input $\vec{n}$ and an oracle extending $\sigma$, so that the
computation enters (and thus remains in) a halting state within the sequence of steps
coded by s. If any step attempts to use oracle information beyond the finite sequence
$\sigma$, T views the computation as nonhalting. The function U(s) takes a sequence of
these computational steps, verifies that the computation entered a halting state, and 
then returns the value produced when the computation halted. 

The value of s plays two roles. The number s itself encodes the sequence of steps 
that the computation performed, so the number of possible steps increases without 
bound as $s \to \infty$. In addition, the value of s such that $T(s, e, \vec{n}, g \upharpoonright s) = 0$ serves as a
bound on how much information was used from the oracle g during the computation.
Again, this limit increases without bound as $s \to \infty$. In order for $T(s, e, \vec{n}, g \upharpoonright s)$ to
return 0, s must be large enough to code the entire sequence of steps and also large
enough that no oracle information was used beyond $g \upharpoonright s$. □


Although the s in our statement of the theorem is a code for a sequence of states, 
not a simple count, we often refer to s as if it counts the “steps” of the computation. 
Formally speaking, for each e, $\vec{n}$, and g we can effectively produce a sequence $s_0$,
$s_1$, ... that corresponds to the sequence of steps taken in the computation. Other
variations of the theorem use s to count the steps, and include additional parameters 
to U. 

Because the parameter k is always clear from context, we typically do not mention 
it explicitly. Thus, for example, the following definition should really be read as a 
family of definitions, one for each k > 0. 


Definition 2.4.2 (Universal computable function). We define the partial function
$\Xi(e, \vec{n}, g)$ as

\[ \Xi(e, \vec{n}, g) = U(\mu s\, [T(s, e, \vec{n}, g \upharpoonright s) = 0]). \]


The function $\Xi$ is a “universal” function for all k-ary partial computable functions:
by simply changing the value of the parameter e, every partial computable k-ary
function with oracle g can be obtained. On the other hand, $\Xi$ itself is immediately
seen to be a partial computable (k + 1)-ary function relative to g.

As with primitive recursive functions, we can view g as a parameter to $\Xi$. This
gives us a collection of Turing functionals, each of which takes a function from $\omega$ to
$\omega$ as input and produces a partial function from $\omega$ to $\omega$ as output.


Definition 2.4.3. A (Turing) functional is a function $\Phi$ defined on $\omega^\omega$ such that for
some $e \in \omega$, $\Phi(g) = \lambda \vec{n}.\, \Xi(e, \vec{n}, g)$ for all $g \in \omega^\omega$. We call e an index for $\Phi$, and if
$\Phi(g) = f$ then we call e a $\Delta^{0,g}_1$ index for f (or $\Delta^0_1$ index for f, if $g = \emptyset$). (We will
better understand the reason for this terminology in Section 2.6.) We say $\Phi$ is k-ary,
where k is the length of $\vec{n}$.


Convention 2.4.4 (Notation for functionals). We reserve $\Xi$ for the function in Definition
2.4.2. We let $\Phi_e$ denote the Turing functional with index $e \in \omega$. We also use
this to denote the partial computable function $\Phi_e(\emptyset)$, i.e., the partial computable
function whose $\Delta^0_1$ index relative to $\emptyset$ is e. Which of these senses the notation is
being used in will always be clear from context. When we wish to talk about Turing
functionals without needing to specify their indices, we will use other capital Greek
letters ($\Delta$, $\Gamma$, $\Phi$, $\Psi$, etc.). In general, if $\Phi$ is a Turing functional and g is an oracle
function, we use $\Phi^g$ and $\Phi(g)$ interchangeably. Thus, we write things like $\Phi_e(g)(0)$,
$\Delta^g(17)$, etc., according to what is most convenient.


It is a standard fact, known as “padding”, that every Turing functional has infinitely
many different indices. A related result is the following. 


Proposition 2.4.5 (Indices and uniformity). Let g be a finitary function on $\omega$ and
$\langle f_i : i \in \omega \rangle$ a sequence of functions. Then $\langle f_i : i \in \omega \rangle$ is g-computable if and only if
there is a computable function $h\colon \omega \to \omega$ such that for every $i \in \omega$, $\Phi^g_{h(i)} = f_i$.

The contrast here is between g knowing only how to compute each $f_i$ individually
(meaning that, for each i, there is some e which is a $\Delta^{0,g}_1$ index for $f_i$), as opposed
to knowing how to compute all of the $f_i$ in sequence (so that for each i, a $\Delta^{0,g}_1$
index for $f_i$ can be found computably). The latter property is known as uniformity: if
$\langle f_i : i \in \omega \rangle$ is g-computable, then we say that $f_i$ is uniformly g-computable in i. As
a concrete example, fix a noncomputable set A, and consider the sequence of initial
segments, $\langle A \upharpoonright i : i \in \omega \rangle$. For each i, $A \upharpoonright i$ is finite, and hence computable, but it is
not uniformly computable (in i). Indeed, $\langle A \upharpoonright i : i \in \omega \rangle$ computes A, and so cannot
itself be computable.

An important consequence of our definitions is that a halting computation, being a 
finite process, only “uses” a finite amount of information from the oracle, because the 
computation has only a finite number of steps when it could query specific values of 
the oracle. This gives the name to the proposition below. First, we add the following 
definition. 


Definition 2.4.6 (Use of a computation). Fix $e \in \omega$, and let $\Phi$ be the Turing functional
with index e.


1. For $n \in \omega$ and $\alpha \in \omega^{<\omega}$, we write $\Phi^\alpha(n) \downarrow$ if there is a sequence s with
$\mathrm{lh}(s) < |\alpha|$ such that $T(s, e, n, \alpha) = 0$, and $\Phi^\alpha(n) \uparrow$ otherwise. In the former
case, we also write $\Phi^\alpha(n) = y$ for $y = U(\mu s\, [\mathrm{lh}(s) < |\alpha| \wedge T(s, e, n, \alpha \upharpoonright s) = 0])$.

2. We write $\Phi^g(n)[t]$ for $\Phi^{g \upharpoonright t}(n)$. The least t, if it exists, such that $\Phi^g(n)[t] \downarrow$ is
called the use of the computation $\Phi^g(n)$.


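Proposition 2.4.7 (Use principle). Let $\Phi$ be a Turing functional.


1. Monotonicity of computations: For $\alpha \in \omega^{<\omega}$ and $n \in \omega$, if $\Phi^\alpha(n) \downarrow = y$ then
$\Phi^\beta(n) \downarrow = y$ for all $\beta \in \omega^{<\omega}$ such that $\alpha \preceq \beta$, and also $\Phi^g(n) \downarrow = y$ for all
$g \in \omega^\omega$ such that $\alpha \preceq g$.

2. Every convergent computation has a use: For all $g \in \omega^\omega$, $\Phi^g(n) \downarrow = y$ if and
only if $\Phi^g(n)[s] \downarrow = y$ for some $s \in \omega$.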
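The idea that a halting computation sees only a finite part of its oracle can be illustrated with a small Python sketch. It is a simplification under stated assumptions: a functional is modeled as a Python function taking an oracle and an input, the names `prefix_oracle`, `run_on_prefix`, and `use` are ours, and the sketch tracks only how far the oracle is queried, ignoring the additional bound on the number of computation steps.

```python
class BeyondPrefix(Exception):
    """Raised when a computation queries the oracle past the given finite prefix."""


def prefix_oracle(alpha):
    """Turn a finite sequence alpha into an oracle that refuses queries >= len(alpha)."""
    def query(i):
        if i >= len(alpha):
            raise BeyondPrefix(i)
        return alpha[i]
    return query


def run_on_prefix(functional, alpha, n):
    """Value of the computation with oracle prefix alpha, or None if alpha is too short."""
    try:
        return functional(prefix_oracle(alpha), n)
    except BeyondPrefix:
        return None


def use(functional, g, n, limit):
    """Least t <= limit such that the computation converges on g restricted to t, else None."""
    for t in range(limit + 1):
        if run_on_prefix(functional, [g(i) for i in range(t)], n) is not None:
            return t
    return None


def sum_first(oracle, n):
    """Example functional: the sum g(0) + ... + g(n-1)."""
    return sum(oracle(i) for i in range(n))
```

In this toy model, extending a prefix on which the computation already converges never changes the answer, which is the monotonicity property in miniature.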


Part (2) above expresses that Turing functionals are continuous as functions 
$\omega^\omega \to \omega^\omega$ in the Baire space topology discussed in Section 1.5. We add one
more convention, largely for convenience. 


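Convention 2.4.8. If $\Phi_e$ is a Turing functional, we follow the convention that for all
$g \in \omega^\omega$ and $\vec{n} = \langle n_0, \ldots, n_{k-1} \rangle \in \omega$, if $\Phi_e^g(\vec{n})[s] \downarrow$ then $n_i < s$ for all $i < k$. If $\Phi_e$
is unary and $\Phi_e^g(n)[s] \downarrow$, we also assume that for every $n^* < n$ there is an $s^* < s$ such
that $\Phi_e^g(n^*)[s^*] \downarrow$.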


The next two theorems establish additional properties of the effective indexing 
from the normal form theorem. Their proofs use additional properties of the indexing 
not evident in the statement of the theorem. However, the ability to prove these 
theorems is a key property of the indexing that is constructed in the proof of the 
normal form theorem. 

As with primitive recursive functions, the key tool for proving these theorems 
is the definition of the partial computable functions as the closure of certain basic 
functions under certain operations. This gives a finite construction tree for each partial
computable function f in which some nodes are labeled with oracle functions. Given 
a way to compute the oracle functions, we can then use the tree to compute arbitrary 
values of f. The index e constructed in Kleene’s normal form theorem is essentially 
a concrete representation of the construction tree. 

In this way, the index is a natural number that does not itself contain an oracle
function. Instead, the index, which is a kind of representation of a computer
program, includes instructions to query the oracle functions at specific points during
a computation. The person computing the function is then responsible for producing 
the correct values of the oracle function as required during the computation process. 

The following proposition shows that the indexing from the normal form theo- 
rem is “effective on indices”: if we have an operation of generalized composition, 
primitive recursion, or minimization, we can compute the index for the resulting 
function from the indices of the functions the operation is applied to. Moreover, 
we can effectively produce an index for a function that chooses between two given 
functions based on whether two inputs are equal. The proof follows from our ability 
to manipulate the finite construction trees in an effective way. 


Proposition 2.4.9. The indexing from the normal form theorem has the following
properties. Here g is an arbitrary oracle.


• Effective branching: There is a primitive recursive function d(i, j) so that

\[ \Phi^g_{d(i,j)}(x, y, \vec{z}) = \begin{cases} \Phi^g_i(x, \vec{z}) & \text{if } x = y, \\ \Phi^g_j(x, \vec{z}) & \text{if } x \neq y. \end{cases} \]

• Effective composition: There is a primitive recursive function c so that

\[ \Phi^g_{c(i, j_1, \ldots, j_k)}(\vec{z}) = \Phi^g_i(\Phi^g_{j_1}(\vec{z}), \ldots, \Phi^g_{j_k}(\vec{z})). \]

• Effective primitive recursion: There is a primitive recursive function r that, given
an index i for a function $\Phi_i(n, x, \vec{t})$ and an index j for a function $\Phi_j(\vec{t})$, produces
an index r(i, j) that applies the primitive recursion scheme to $\Phi_i$ and $\Phi_j$:

\[ \Phi^g_{r(i,j)}(n, \vec{z}) = \begin{cases} \Phi^g_j(\vec{z}) & \text{if } n = 0, \\ \Phi^g_i(n, \Phi^g_{r(i,j)}(n-1, \vec{z}), \vec{z}) & \text{if } n > 0. \end{cases} \]

Here $\Phi^g_{r(i,j)}(n, \vec{z}) \uparrow$ if any of the computations along the way diverges.

• Effective minimization: There is a primitive recursive function m so that, for all
i and all $\vec{z}$,

\[ \Phi^g_{m(i)}(\vec{z}) = (\mu x)[\Phi^g_i(x, \vec{z}) = 0]. \]


The second of our three key theorems is the parameterization theorem, more 
commonly known as the $S^m_n$ theorem after the functions that Kleene originally used to
state it. The theorem says that if we have a computable function f(s, t), we can hard-code
a value k for the input s to produce a computable function $g(t) = \lambda t.\, f(k, t)$.
Moreover, the index for g can be computed by a primitive recursive function given
the index of f and the value for k. The proof follows, again, from our ability to 
effectively manipulate the construction tree encoded by the index. 


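Theorem 2.4.10 ($S^m_n$ theorem, Kleene). For all $m, n \in \omega$, there is an (m + 1)-ary
primitive recursive function $S^m_n$ such that for all $e \in \omega$ and $\vec{x} \in \omega^m$, $S^m_n(e, \vec{x})$ is an
index of an n-ary partial computable function satisfying

\[ \Phi^g_{S^m_n(e, \vec{x})}(\vec{y}) = \Phi^g_e(\vec{x}, \vec{y}) \]

for all finitary functions g and all $\vec{y} \in \omega^n$.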
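A rough programming analogy may help here. In the Python sketch below, program text plays the role of an index, and `smn` is a hypothetical helper of our own that turns the text of a two-argument function into the text of a one-argument function with the first input hard-coded. The point of the analogy is that `smn` only manipulates text; it never runs the program it is given.

```python
def smn(source, x):
    """Return source text for y -> f(x, y), where `source` is the text of a two-argument lambda."""
    return f"(lambda y: ({source})({x!r}, y))"


def run(source, *args):
    """Evaluate program text on the given arguments (the analogue of the universal function)."""
    return eval(source)(*args)


f_source = "lambda s, t: 10 * s + t"
g_source = smn(f_source, 4)          # text of the function t -> f(4, t)
```

Running `g_source` on the input 2 gives the same value as running `f_source` on the inputs 4 and 2.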


The final key theorem allows us to create programs that, in a way, have access to 
their own index. This is the recursion theorem, also known as Kleene’s fixed point 
theorem. 


Theorem 2.4.11 (Recursion theorem, Kleene [180]). For every computable function
$f\colon \omega \to \omega$ there is an index e such that $\Phi^g_e = \Phi^g_{f(e)}$ for every oracle g.

The fixed point theorem is extremely useful for defining functions by recursion. 
The next proposition gives a concrete example, and its proof illustrates the utility of 
Proposition 2.4.9.


Proposition 2.4.12. The function A(n,x) defined by the following recurrence is 
computable: 


\[ A(0, x) = 2x, \qquad A(n+1, x) = \underbrace{A_n(A_n(\cdots A_n(1) \cdots))}_{x \text{ times}}, \]

where A,(x) denotes A(n, x). 


Proof. There is a primitive recursive function f(i) that creates an index for a function
$\Phi_{f(i)}(s, x)$ that does the following.


• If s = 0, return 2x.
• If s = n + 1, use primitive recursion to compute a sequence $r_0, \ldots, r_x$ where
$r_0 = 1$ and $r_{m+1} = \Phi_i(n, r_m)$ for $0 \leq m < x$. Then return $r_x$.


To show that f is computable, we first note that we can compute an index to 
perform the operation in each bullet. The first bullet is a primitive recursive function 
of x, and thus has an index by the statement of the normal form theorem. To show 
there is a primitive recursive function to find an index of a function to perform 
the primitive recursion in the second bullet from the index i, we use the effective
composition and effective primitive recursion properties from Proposition 2.4.9. We
also use the fact that each of the basic primitive recursive functions has an index.

Once we have a primitive recursive function to compute an index for each bullet, 
it follows from the effective branching property that f is primitive recursive. 

By the recursion theorem, there is an index p so that $\Phi_p = \Phi_{f(p)}$. A straightforward
induction on n shows that $\Phi_p(n, x) = A(n, x)$ for all n. Essentially, given a
value for n, the index p can “use itself” in place of index i to compute any needed
results for smaller values of n. □
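In a programming language that permits a function to call itself by name, the recurrence for A can be written down directly, which is the convenience that the recursion theorem supplies in our formal setting. The following Python sketch is an illustration only; it follows the two recurrence equations, with the inner loop computing the x-fold iteration of A(n, ·) starting from 1.

```python
def A(n, x):
    """The function A from the recurrence: A(0, x) = 2x, and A(n+1, x) iterates A(n, .) x times on 1."""
    if n == 0:
        return 2 * x
    r = 1
    for _ in range(x):               # r_0 = 1, r_{m+1} = A(n - 1, r_m)
        r = A(n - 1, r)
    return r
```

For example, A(1, x) is 2 raised to the power x, and the values grow very quickly as n increases, so only small inputs are practical.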


A less useful but perhaps equally interesting consequence of the fixed point 
theorem is the ability to produce functions that return their own index. Programs
corresponding to indices of this kind are often called “quines” in honor of Willard 
Van Orman Quine. 


Proposition 2.4.13. There is an index e so that $\Phi_e(0) = e$.


Proof. Let i be an index of the computable function $(s, t) \mapsto s$. Let f be the
function $\lambda s.\, S^1_1(i, s)$, where $S^1_1$ is from the $S^m_n$ theorem. Then f is computable (in
fact, primitive recursive). Apply the recursion theorem to f to obtain an index e with
$\Phi_e = \Phi_{f(e)}$. Then


\[ \Phi_e(0) = \Phi_{f(e)}(0) = \Phi_{S^1_1(i, e)}(0) = \Phi_i(e, 0) = e. \]
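A familiar concrete counterpart is a program that prints its own source text. The Python sketch below builds such a two-line program from a template and checks the claim by running it; the names `TEMPLATE`, `PROGRAM`, and `run` are ours, and the construction uses string formatting rather than the index manipulation of the proof above, so it should be read as an analogy.

```python
import contextlib
import io

# The template refers to itself through %r, which inserts its own quoted text.
TEMPLATE = 's = %r\nprint(s %% s)'
PROGRAM = TEMPLATE % TEMPLATE        # full text of the self-printing program


def run(source):
    """Execute program text and return everything it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()
```

Executing `PROGRAM` prints exactly the text of `PROGRAM`, followed by the newline that `print` appends.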


2.5 Computably enumerable sets and the halting problem 


For a computable set, we have an effective procedure for deciding which numbers are
and are not in the set. But sometimes, what we have instead is an effective procedure 
for listing, or enumerating, the elements of the set. This notion turns out to capture 
a broader collection of sets. 


Definition 2.5.1. A set $X \subseteq \omega$ is computably enumerable (or c.e. for short) if there is
a partial computable function with X as its domain. More generally, if g is a finitary
function on $\omega$ and P is a finitary relation, then P is g-computably enumerable (g-c.e.)
if there is a partial g-computable function with $\{\vec{n} : P(\vec{n})\}$ as its domain.


The previous definition may seem confusing, since it does not seem to agree with 
the intuitive description of “enumerating” a set we started with. The next lemma 
shows that it agrees with what we expect. 


Lemma 2.5.2. A nonempty set $X \subseteq \omega$ is c.e. if and only if there is a computable
function whose range is X. Moreover, if X is infinite then we may assume this function 
is injective. 
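One direction of the lemma rests on a technique often called dovetailing: run the given partial function on more and more inputs for more and more steps, and list an input as soon as its computation is seen to halt. The Python sketch below illustrates this under assumptions of our own: a program is modeled as a generator function that yields once per computation step and returns when it halts, and the names `halts_within`, `enumerate_domain`, and `evens_only` are hypothetical.

```python
from itertools import count, islice


def halts_within(program, n, steps):
    """Run program on input n for at most `steps` steps; report whether it halted."""
    run = program(n)
    for _ in range(steps):
        try:
            next(run)                # advance the computation by one step
        except StopIteration:
            return True              # the generator returned, so the computation halted
    return False


def enumerate_domain(program):
    """List, without repetition, every input on which program halts."""
    seen = set()
    for stage in count():
        for n in range(stage + 1):   # at stage s, try inputs 0..s for s steps each
            if n not in seen and halts_within(program, n, stage):
                seen.add(n)
                yield n


def evens_only(n):
    """Example program: halts after n steps on even n, runs forever on odd n."""
    if n % 2 == 1:
        while True:
            yield
    for _ in range(n):
        yield
```

The enumeration never gets stuck on an input where the program diverges, because each attempt is cut off after finitely many steps.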


Crucially, we cannot assume that the function above is increasing. Indeed, if a set 
could be effectively enumerated in increasing order, it would be computable: to 
effectively decide whether or not a given number x is in the set we simply wait
either for x to be enumerated, or for some y > x to be enumerated without x having 
been enumerated earlier. 
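The waiting argument can be made concrete with a short Python sketch. We assume here, for illustration, that the set is infinite and is presented by a function returning a fresh iterator that lists its elements in strictly increasing order; the names `member` and `squares` are our own.

```python
from itertools import count


def member(x, increasing_enumeration):
    """Decide whether x is in an infinite set enumerated in strictly increasing order."""
    for y in increasing_enumeration():
        if y == x:
            return True              # x has been enumerated
        if y > x:
            return False             # the enumeration has passed x without listing it


def squares():
    """Example: the perfect squares, listed in increasing order."""
    return (k * k for k in count())
```

Because the set is assumed infinite, some element at least as large as x eventually appears, so the loop always terminates.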

But there are c.e. sets that are not computable. The next definition gives perhaps
the most famous example. Its name comes from an interpretation in which we are 
asking for a way to determine whether a particular program will eventually halt (stop 
running), or whether it will continue forever without stopping, as in an infinite loop. 


Definition 2.5.3 (Halting problem). The halting problem is the set
$\emptyset' = \{e \in \omega : \Phi_e(e) \downarrow\}$. More generally, the halting problem relative to a finitary
function g, also called the Turing jump of g, is the set $g' = \{e \in \omega : \Phi^g_e(e) \downarrow\}$.


It is common to call the map $' \colon \omega^\omega \to 2^\omega$, $g \mapsto g'$, the jump operator. We collect
some of the most well-known properties of this operator in
the next theorem. Part (1), implicit in the original groundbreaking paper of Alan 
Turing [314], was the first clue that the Turing degrees are a nontrivial structure. 


Theorem 2.5.4. Fix a finitary function g. 


1. $g \leq_T g'$ but $g' \nleq_T g$.

2. If h is a finitary function and $g \equiv_T h$ then $g' \equiv_T h'$.

3. $g'$ is g-c.e.

4. If $A \subseteq \omega$ is g-c.e. then $A \leq_T g'$.

5. A set $A \subseteq \omega$ is g-computable if and only if A and $\overline{A}$ are both g-c.e.


Combining (3) and (4), we see that $g'$ is a kind of “universal g-c.e.” set. We will
see a further elucidation of the properties in the above theorem in Post’s theorem 
(Theorem 2.6.2) below. That theorem will make key use of iterated Turing jumps, as 
in the following definition. 


Definition 2.5.5. Let g be a finitary function. We define $g^{(n)}$, for each $n \in \omega$, as
follows: $g^{(0)} = g$, and $g^{(n+1)} = (g^{(n)})'$.


By Theorem 2.5.4, for every g we have a strictly increasing hierarchy

\[ g <_T g' <_T \cdots <_T g^{(n)} <_T g^{(n+1)} <_T \cdots. \]


The jump operator also gives rise to the following hierarchy of functions which are 
“close to computable” in a certain sense. 


Definition 2.5.6 (Low$_n$). Fix $n \in \omega$ and a finitary function g. A set X is low$_n$
relative to g if $(g \oplus X)^{(n)} \leq_T g^{(n)}$. If $g = \emptyset$, then X is simply low$_n$.


Thus, the computable functions are the same as the low$_0$ functions (and so the latter
term is not used). The low$_1$ functions are typically just called low. Thus a function g
is low if $g' \equiv_T \emptyset'$. If a function is low$_n$, it is of course low$_{n+1}$, but the converse need
not be true.


2.6 The arithmetical hierarchy and Post’s theorem


Post’s theorem establishes a tight link between iterations of the Turing jump and 
numerical quantifiers. The connection with quantifiers is of particular interest in 
reverse mathematics because, in the context of formal theories of arithmetic, we can 
count quantifiers in specific formulas of interest. The basic conclusion is that the 
number of iterated Turing jumps required to compute a set is tied to the number of 
alternations of universal and existential quantifiers required to define the set. 

There are many versions of the “arithmetical hierarchy”. Some are syntactical hi- 
erarchies that assign classifications to formulas; others are semantical hierarchies that 
assign classifications to relations. Different versions also look at different families 
of formulas or different families of relations. In this section, we define a semantical 
hierarchy that assigns classifications to certain relations on $\omega$, beginning with the
classification for computable relations. In Chapter 5, we will define a closely related 
syntactical hierarchy for formulas in second order arithmetic. 

This arithmetical hierarchy assigns one or more classifications, relative to an
oracle g, to certain relations on $\omega$. The possible classifications are $\Sigma^{0,g}_n$, $\Pi^{0,g}_n$, and
$\Delta^{0,g}_n$, for $n \in \omega$, which may be alternatively written as $\Sigma^0_n$, $\Pi^0_n$, and $\Delta^0_n$ relative to g,
or (less commonly) as $\Sigma^0_n(g)$, $\Pi^0_n(g)$, and $\Delta^0_n(g)$. When g is computable (or when
no oracle is intended), it is not written, giving simply $\Sigma^0_n$, $\Pi^0_n$, and $\Delta^0_n$.


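Definition 2.6.1 (Arithmetical hierarchy for relations on $\omega$).
Fix $n \in \omega$, let g be a finitary function on $\omega$, and let P be a finitary relation.


1. P is $\Sigma^{0,g}_0$ and also $\Pi^{0,g}_0$ if it is g-computable.

2. P is $\Sigma^{0,g}_{n+1}$ if $P(\vec{x}) \equiv (\exists \vec{y})Q(\vec{y}, \vec{x})$ for some $\Pi^{0,g}_n$ relation $Q(\vec{y}, \vec{x})$.

3. P is $\Pi^{0,g}_{n+1}$ if $P(\vec{x}) \equiv (\forall \vec{y})Q(\vec{y}, \vec{x})$ for some $\Sigma^{0,g}_n$ relation $Q(\vec{y}, \vec{x})$.

4. P is $\Delta^{0,g}_n$ if it is $\Sigma^{0,g}_n$ and $\Pi^{0,g}_n$.

Any relation that receives a classification in this hierarchy is an arithmetical relation
(relative to g). A relation is properly $\Sigma^{0,g}_n$ or properly $\Pi^{0,g}_n$ if it is $\Sigma^{0,g}_n$ or $\Pi^{0,g}_n$,
respectively, but not $\Delta^{0,g}_n$. For n > 1, a relation is properly $\Delta^{0,g}_n$ if it is $\Delta^{0,g}_n$ but not
$\Sigma^{0,g}_{n-1}$ or $\Pi^{0,g}_{n-1}$.

Because we can introduce and quantify over dummy variables, every $\Sigma^{0,g}_n$ or $\Pi^{0,g}_n$
relation is $\Delta^{0,g}_{n+1}$. We often identify $\Sigma^{0,g}_n$, $\Pi^{0,g}_n$, and $\Delta^{0,g}_n$ with the collections of
relations they describe, so that, e.g., $\Delta^{0,g}_n = \Sigma^{0,g}_n \cap \Pi^{0,g}_n$.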

Note that the complexity of a relation in the arithmetical hierarchy can be measured
by the number of blocks of alternating quantifiers that appear in its definition. For
example, if $P$ is $\Sigma^0_1$ then it has the form $(\exists x)R(x)$ for some computable predicate $R$,
and if $P$ is $\Sigma^0_2$ it has the form $(\exists x)(\forall y)R(y, x)$. By induction, if $P$ is $\Sigma^0_n$ for some
$n > 2$ then it has the form

$(\exists x_{n-1})(\forall x_{n-2})(\exists x_{n-3}) \cdots (Q x_0) R(x_0, \ldots, x_{n-2}, x_{n-1})$,

where $Q$ is $\forall$ or $\exists$ depending as $n$ is even or odd. A similar observation holds
for $\Pi^0_n$ formulas, with the quantifiers interchanged. Geometrically, $\Sigma^0_{n+1}$ relations
are the projections of higher-dimensional $\Pi^0_n$ relations, and $\Pi^0_{n+1}$ relations are the
complements of $\Sigma^0_{n+1}$ relations.

Figure 2.1. The arithmetical hierarchy, showing classes of finitary relations on $\omega$ under inclusion.
Here, $n > 2$ is arbitrary. All containments are strict. The labels restate the characterizations of each
class from Post’s theorem and the limit lemma.


The next theorem characterizes the overall structure of the arithmetical hierarchy. 
Figure 2.1 provides an illustration. 


Theorem 2.6.2 (Post’s theorem). Fix $n \in \omega$, let $g$ be a finitary function on $\omega$, and
let $P$ be a finitary relation.

1. $P$ is $\Delta^{0,g}_{n+1}$ if and only if it is computable from $g^{(n)}$. In particular, a relation is
$\Delta^{0,g}_1$ if and only if it is computable from $g$.

2. $P$ is $\Sigma^{0,g}_{n+1}$ if and only if it is c.e. in $g^{(n)}$.

3. $P$ is $\Pi^{0,g}_{n+1}$ if and only if its complement is c.e. in $g^{(n)}$.

In addition, for $n > 0$, the class of $\Delta^{0,g}_n$ relations is a proper subclass of the classes
of $\Sigma^{0,g}_n$ and $\Pi^{0,g}_n$ relations. In particular, neither of the latter two classes is contained
in the other.


Importantly, we see the identification of $\Delta^0_1$ relations with computable relations.
This is part of Theorem 2.5.4 above, and is where we get the term “$\Delta^0_1$ index” in
Definition 2.4.3. We also see the identifications of $\Sigma^0_1$ relations with c.e. relations,
and $\Pi^0_1$ relations with the complements of c.e. relations. In addition, every $\Sigma^0_n$ relation,
being also $\Delta^0_{n+1}$, is computable in $\emptyset^{(n)}$, which generalizes Theorem 2.5.4. Thus,
each jump can be thought of as helping to “answer” questions with one additional
alternating quantifier in the definition of the relation.

Post’s theorem has the following immediate corollary, which establishes a firm 
connection between computational strength and logical syntax. 


Corollary 2.6.3. Fix $n \in \omega$, let $g$ be a finitary function on $\omega$, and let $P$ be a finitary
relation. Each of the classes of $\Sigma^{0,g}_n$, $\Pi^{0,g}_n$, and $\Delta^{0,g}_n$ relations is closed under
conjunction, disjunction, and bounded quantification, and the class of $\Delta^{0,g}_n$ relations
is also closed under negation.


Formally, this allows us to represent various natural statements about arithmetical 
objects by other arithmetical objects. For example, given a computable set A, the 
statement that $A$ is infinite, $(\forall x)(\exists y)[y > x \wedge y \in A]$, defines a $\Pi^0_2$ set, $R_A(z)$. Since
$z$ does not appear in the definition of $R_A$, it follows that $R_A$ is either all of $\omega$ or the
empty set depending as $A$ is or is not infinite. By Post’s theorem, there is an $e$ such
that $R_A = \Phi_e^{\emptyset''}$, so $\emptyset''$ can check whether or not, say, $0 \in R_A$, whereby it can “answer”
whether or not $A$ is infinite. (A point of caution: each of $\emptyset$ and $\omega$ is computable, but
there is no single index $e$ such that $R_A = \Phi_e$! In the parlance of the remark following
Proposition 2.4.5 above, $R_A$ is uniformly $\emptyset''$-computable from a $\Delta^0_1$ index for $A$; it
is not uniformly computable from this index. So, $\emptyset$ cannot provide the “answer”.) In
practice, we usually forgo such details, and simply say things like, “The question
whether a $\Sigma^0_n$ property is true can be answered by $\emptyset^{(n)}$”, etc.

We wrap up this section with one further, equally important classification specifically
for $\Delta^0_2$ relations. This uses limit approximations.


Definition 2.6.4. Let $f$ be a 2-ary function. For each $x \in \omega$, we write $\lim_y f(x, y)\downarrow$
if there is a $z$ such that $(\exists w)(\forall y > w)[f(x, y) = z]$. In this case, we also write
$\lim_y f(x, y) = z$.


Theorem 2.6.5 (Limit lemma, Shoenfield). Let $g$ be a finitary function on $\omega$. A
finitary relation $P$ is $\Delta^{0,g}_2$ if and only if there is a $g$-computable function $f : \omega^2 \to 2$
such that $\lim_y f(x, y)\downarrow$ for all $x$, and $\lim_y f(x, y) = 1$ if and only if $P(x)$.
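
To get a feel for limit approximations, the following Python sketch illustrates the standard example of guessing whether a program halts by running it for more and more steps. It is only a toy model under stated assumptions: the list PROGRAMS and the helpers halts_after and loops_forever are hypothetical stand-ins for an enumeration of machines, not part of the formal development. Each guess $f(x, y)$ is computable, changes at most once as $y$ grows, and its limit is 1 exactly when program $x$ halts.

```python
from typing import Callable, Iterator, List


def halts_after(n: int) -> Callable[[], Iterator[None]]:
    """Toy program that runs for n steps and then halts."""
    def program() -> Iterator[None]:
        for _ in range(n):
            yield
    return program


def loops_forever() -> Iterator[None]:
    """Toy program that never halts."""
    while True:
        yield


# Hypothetical list of programs indexed by x. It stands in for an enumeration
# of machines; it is not an actual enumeration of Turing machines.
PROGRAMS: List[Callable[[], Iterator[None]]] = [
    halts_after(0), loops_forever, halts_after(5), halts_after(2),
]


def f(x: int, y: int) -> int:
    """Stage-y guess: 1 if program x halts within y steps, and 0 otherwise."""
    run = PROGRAMS[x]()
    for _ in range(y + 1):
        try:
            next(run)
        except StopIteration:
            return 1
    return 0
```

For instance, the guesses for program 2 are 0 at stages 0 through 4 and 1 from stage 5 on, while the guesses for program 1 are 0 at every stage. No single stage settles all inputs at once, which is why the limit, and not any one guess, carries the information.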


The next corollary, which follows immediately from the limit lemma and Post’s 
theorem, is a commonly used application. 


Corollary 2.6.6. Let $g$ be a finitary function on $\omega$. For every $g'$-computable function
$h : \omega \to \omega$ there exists a $g$-computable function $f : \omega^2 \to \omega$ such that $h(x) =
\lim_y f(x, y)$ for all $x$.

We can extend the definition of the arithmetical hierarchy to describe relations
on functions (and so, sets), which will be useful for our work in Section 2.8.


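Definition 2.6.7. Fix $n \in \omega$ and let $g$ be a finitary function on $\omega$.

1. A $\Sigma^{0,g}_0$ (also $\Pi^{0,g}_0$) relation on $\omega^\omega$ is an $e \in \omega$ such that $\Phi_e^{f \oplus g}(0)\downarrow$ for all
$f \in \omega^\omega$. We identify $e$ with the relation $R$ that holds of $f \in \omega^\omega$ if and only if
$\Phi_e^{f \oplus g}(0)\downarrow = 1$.

2. A $\Sigma^{0,g}_{n+1}$ relation on $\omega^\omega$ is a $g$-computable sequence $\langle 0, \langle e_i : i \in \omega \rangle \rangle$ such that
for each $i$, $e_i$ is a $\Pi^{0,g}_n$ relation on $\omega^\omega$. If $e_i$ is identified with the relation $R_i$,
then we identify $\langle 0, \langle e_i : i \in \omega \rangle \rangle$ with the relation $R$ that holds of $f$ if and only
if $R_i$ holds of $f$ for some $i$.

3. A $\Pi^{0,g}_{n+1}$ relation on $\omega^\omega$ is a $g$-computable sequence $\langle 1, \langle e_i : i \in \omega \rangle \rangle$ such that
for each $i$, $e_i$ is a $\Sigma^{0,g}_n$ relation on $\omega^\omega$. If $e_i$ is identified with the relation $R_i$,
then we identify $\langle 1, \langle e_i : i \in \omega \rangle \rangle$ with the relation $R$ that holds of $f$ if and only
if $R_i$ holds of $f$ for all $i$.

$\Sigma^{0,g}_n$ and $\Pi^{0,g}_n$ relations on $2^\omega$ (or some other subset of $\omega^\omega$) are defined in the obvious
way. For instance, for each function $g$, the relation “$f(x) < g(x)$ for all $x$” is a $\Pi^{0,g}_1$
relation on $\omega^\omega$. For an infinite set $X$, the relation “$Y$ is an infinite subset of $X$” is a
$\Pi^{0,X}_2$ relation on $2^\omega$.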


2.7 Relativization and oracles 


As we have already remarked, the central focus of modern computability theory is 
really relative computability. This is especially the case for the aspects of computabil- 
ity theory that pertain to reverse mathematics, as we will discover in Chapter 5. In 
a certain sense, everything can be relativized. For example, if we wish to relativize
to some finitary function $g$, we would change all instances of “computable” to
“$g$-computable”, “$\emptyset'$” to “$g'$”, “$\Sigma^0_n$” to “$\Sigma^{0,g}_n$”, etc. This much we have seen throughout
this chapter.

It is thus common in the subject to formulate definitions and results in unrela- 
tivized form, meaning without the use of oracles. Typically, it is easy and straight- 
forward for these to then be relativized to an arbitrary oracle g, simply by “inserting 
the oracle” appropriately, as above. Some care must be taken of course: for example, 
relativizing “$X$ is low” to $g$ does not mean “$X' \leq_T g'$” but “$(g \oplus X)' \leq_T g'$” (see
Definition 2.5.6).

We will generally follow this practice moving forward, beginning already in the 
next section. So, though a result may be stated and proved in unrelativized form, we 
may later appeal to the relativized version by saying, e.g., “Relativizing such and 
such theorem to g, ...”. Even when a theorem is formulated explicitly in terms of an 
arbitrary oracle, we will often prove only the unrelativized version, and then simply 
remark that “the full version follows by relativization”.

To be sure, there are also senses in which not everything relativizes. Most famously,
it is known that the different “upper cones” in the Turing degrees (meaning,
sets of the form $\{\mathbf{d} : \mathbf{a} \leq \mathbf{d}\}$ for different degrees $\mathbf{a}$) need not be isomorphic
or even elementarily equivalent as partial orders, as shown by Feiner [105] and
Shore [280, 281]. However, virtually everything we consider in this book will act in
accordance with the following maxim:


Maxim 2.7.1. All natural mathematical statements about $\omega$ relativize.


This is unavoidably (intentionally) vague and subjective, but it serves as a useful 
starting point to frame our discussions. Certainly, the few examples we will see 
of statements that do not relativize (e.g., Example 4.6.8) will not be “natural” or 
“naturally occurring” in any way. 


2.8 Trees and PA degrees 


We now come to one of the first and most important points of intersection between 
computability and combinatorics: trees. Of course, trees and tree-like structures 
permeate many branches of mathematics. In computability theory and reverse
mathematics, they arise as frameworks for building approximations. For example, to
produce a function $f : \omega \to \omega$ with certain properties, we might first define $f(0)$,
then define $f(1)$ using $f(0)$, then define $f(2)$ using $f(1)$ and $f(0)$, etc. If we have
only one choice for $f(s + 1)$ for each $s$, we will build a simple sequence of
approximations. But suppose we have multiple choices (perhaps even infinitely many) for
$f(s + 1)$ given the sequence $f(0), \ldots, f(s)$. In that case, what we will end up with is a
tree of approximations to $f$, and there are multiple possibilities for what $f$ can be
depending on which “path” we take through this tree.
The following definitions will help make these comments more precise. 


Definition 2.8.1. Fix $X \in 2^\omega$ and $T \subseteq X^{<\omega}$.

1. $T$ is a tree if it is closed under initial segments, i.e., if $\alpha \in T$ and $\beta \preceq \alpha$ then
$\beta \in T$. If $U \subseteq T$ is also a tree then $U$ is a subtree of $T$.

2. $T$ is finitely branching if for each $\alpha \in T$ there are at most finitely many $x \in X$
such that $\alpha x \in T$.

3. $T$ is bounded if there is a function $b : \omega \to \omega$ such that for every $\alpha \in T$,
$\alpha(i) < b(i)$ for all $i < |\alpha|$, in which case we also say $T$ is $b$-bounded.

4. An infinite path (or just path) through $T$ is a function $f : \omega \to X$ such that
$f \upharpoonright k \in T$ for every $k \in \omega$. The set of all paths is denoted $[T]$.

5. $T$ is well founded if $T$ has no path, i.e., if $[T] = \emptyset$.

6. $\alpha \in T$ is extendible if $\alpha \prec f$ for some $f \in [T]$. The set of all extendible $\alpha \in T$
is denoted $\mathrm{Ext}(T)$.


Naturally, $X^{<\omega}$ itself is a tree. Also, every bounded tree is finitely branching. The
elements of a tree $T$ are also sometimes called nodes. If $\alpha \prec \beta$ belong to $T$ and
$|\beta| = |\alpha| + 1$, then $\beta$ is an immediate successor of $\alpha$ in $T$. If $\alpha \in T$ and there is no
$x \in X$ such that $\alpha x \in T$ (or equivalently, there is no $\beta \in T$ such that $\beta \succ \alpha$) then $\alpha$
is called a leaf or end node of $T$.

The following shows that the sets of paths through trees have a natural connection
to the clopen topology on $X^\omega$ (Definition 1.5.3).


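Proposition 2.8.2. Fix $X \in 2^\omega$ and $C \subseteq X^\omega$. Then $C$ is closed in the clopen topology
if and only if $C = [T]$ for some tree $T \subseteq X^{<\omega}$.


Proof. First suppose $C$ is closed. Write $X^\omega \setminus C$ as $\bigcup_{\alpha \in U} [[\alpha]]$ for some $U \subseteq X^{<\omega}$.
Then $T = \{\alpha \in X^{<\omega} : (\forall k \leq |\alpha|)[\alpha \upharpoonright k \notin U]\}$ is a tree with $C = [T]$. Now
suppose $C = [T]$. We may assume $\langle\rangle \in T$, since otherwise $T = \emptyset$ and $C = \emptyset$. Let
$U = \{\alpha \in X^{<\omega} : \alpha \notin T \wedge \alpha \upharpoonright (|\alpha| - 1) \in T\}$. Then $X^\omega \setminus C = \bigcup_{\alpha \in U} [[\alpha]]$. $\square$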


As is evident from the discussion above, our main interest in trees is in the 
objects we can construct using them, i.e., the paths. In general, there is no simple 
combinatorial criterion for a tree to have a path. But for finitely branching trees, 
there is, as given by the following hallmark result. 


Theorem 2.8.3 (König’s lemma). Let $T \subseteq \omega^{<\omega}$ be a finitely branching tree. The
following are equivalent.


1. T is not well founded. 
2. T is infinite. 
3. For every k € w, T contains a string of length k. 


Proof. (1) $\to$ (2): If $f \in [T]$ then $f \upharpoonright k \in T$ for each $k \in \omega$, and these form
infinitely many nodes in $T$.


(2) $\to$ (3): We prove the contrapositive. Suppose $k \in \omega$ is such that $T$ has no node
of length $k$. Being a tree, $T$ is closed under $\preceq$, and therefore every node of $T$ must
have length some $j < k$. But as $T$ is finitely branching, and contains at most one
node of length 0 (the empty string), it follows by induction that $T$ has finitely many
nodes of length any such $j$. Ergo, $T$ is finite.


(3) $\to$ (1): We define a sequence $\alpha_0 \prec \alpha_1 \prec \cdots$ of elements of $T$ such that, for all
$s \in \omega$, $|\alpha_s| = s$ and there are infinitely many $\beta \in T$ with $\alpha_s \preceq \beta$.

This suffices, since the first property ensures $\bigcup_{s \in \omega} \alpha_s$ is a path through $T$. By
assumption, the empty string, $\langle\rangle$, belongs to $T$. Let this be $\alpha_0$, noting that this has
the properties. Assume next that $\alpha_s$ has been defined for some $s$. By hypothesis,
there are infinitely many $\beta \in T$ such that $\alpha_s \preceq \beta$. Since $T$ is a tree, this means in
particular that $\alpha_s$ has at least one extension $\alpha \in T$ of length $s + 1$. But since $T$ is
finitely branching, there must be at least one such $\alpha$ such that $\alpha \preceq \beta$ for infinitely
many $\beta \in T$. Let this be $\alpha_{s+1}$. Clearly, $\alpha_{s+1}$ has the desired properties. $\square$


It is worth emphasizing a restatement of the equivalence of (1) and (2) above,
which is that $[T]$ is nonempty if and only if $T$ is infinite. We often invoke König’s
lemma in this form.

More commonly, “König’s lemma” is used to refer just to the implication from
(2) to (1) above, under the assumption that $T$ is finitely branching. Importantly,
this implication fails if $T$ is not finitely branching. For example, consider the set
containing the empty node, $\langle\rangle$, and for each $x \in \omega$ the singleton node $\langle x \rangle$. This is an
infinite subtree of $\omega^{<\omega}$, but clearly it has no path. A different example can be used
to show that if $T$ is not finitely branching then the implication from (3) to (1) fails as
well.


Corollary 2.8.4. If $T \subseteq \omega^{<\omega}$ is an infinite, finitely branching tree, then there is an
$f \in [T]$ such that $f \leq_T \mathrm{Ext}(T)$.


Proof. By the proof of the implication (3) $\to$ (1) in the theorem, noting that for
every $s$, we can take $\alpha_{s+1} = (\mu \alpha \in \mathrm{Ext}(T))[\alpha_s \prec \alpha]$. $\square$
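
The search in this proof can be sketched in Python, assuming we are handed a membership test for $\mathrm{Ext}(T)$ to play the role of the oracle. The names path_prefix, in_T and in_ext_T are hypothetical, and the tree is a toy example chosen so that $\mathrm{Ext}(T)$ is easy to describe. As a small variation on the proof, the sketch picks the least value $x$ keeping the node extendible rather than the least code of a node.

```python
from itertools import count
from typing import Callable, Tuple

Node = Tuple[int, ...]


def path_prefix(is_extendible: Callable[[Node], bool], n: int) -> Node:
    """Return the first n values of a path, computed from a test for Ext(T)."""
    alpha: Node = ()
    for _ in range(n):
        # If alpha is extendible, some immediate successor of alpha is
        # extendible too, so this unbounded search halts.
        x = next(x for x in count() if is_extendible(alpha + (x,)))
        alpha = alpha + (x,)
    return alpha


# Toy tree (an assumption for illustration): all strings of 1s, together
# with the strings of 0s of length at most 3.
def in_T(sigma: Node) -> bool:
    all_ones = all(b == 1 for b in sigma)
    short_zeros = len(sigma) <= 3 and all(b == 0 for b in sigma)
    return all_ones or short_zeros


# For this toy tree, the extendible nodes are exactly the strings of 1s.
def in_ext_T(sigma: Node) -> bool:
    return all(b == 1 for b in sigma)
```

Here the node (0, 0, 0) belongs to the tree but is a dead end, so a search guided only by membership in $T$ could get stuck, whereas the search guided by $\mathrm{Ext}(T)$ never does.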


Per our discussion in Section 2.2.3, if X is a computable set we say that a tree 
$T \subseteq X^{<\omega}$ is computable if (the set of codes of elements of) $T$ is a computable subset
of $\omega$. We say $T$ is computably bounded if it is $b$-bounded for some computable
function $b : \omega \to \omega$. We will see later, in Exercise 4.8.5, that for the purposes of
understanding the complexity of the paths through such trees it suffices to look at
computable subtrees of $2^{<\omega}$. (Of course, we also care about relativizing the above
effectivity notions, both for definitions and results, but these are straightforward to 
formulate and prove from the unrelativized versions, as discussed in the previous 
section.) 

Computable, computably bounded trees arise naturally throughout computability 
theory, with the interest usually being in the paths rather than the trees themselves. 
To understand the complexity of such paths, we begin with the following. 


Proposition 2.8.5. Suppose $T \subseteq \omega^{<\omega}$ is a tree that is infinite, computable, and
computably bounded. Then $\mathrm{Ext}(T) \leq_T \emptyset'$, and so $[T]$ contains a member $f \leq_T \emptyset'$.


Proof. For each $k \in \omega$, there are only finitely many binary strings of length $k$, so the set
of codes of such strings is bounded. Moreover, this bound is primitive recursive in
$k$. It follows that if $T$ is a computable subtree of $2^{<\omega}$, there is a computable relation
$P$ such that for all $k \in \omega$ and all (codes for) $\alpha \in 2^{<\omega}$, $P(\alpha, k)$ holds if and only
if $(\exists \beta \in T)[\beta \succeq \alpha \wedge |\beta| = k]$. Thus, $\alpha \in \mathrm{Ext}(T) \leftrightarrow \alpha \in T \wedge (\forall k \geq |\alpha|)P(\alpha, k)$,
meaning $\mathrm{Ext}(T)$ is a $\Pi^0_1$ set, and hence is computable in $\emptyset'$ by Post’s theorem
(Theorem 2.6.2). The rest follows by Corollary 2.8.4. $\square$


2.8.1 $\Pi^0_1$ classes


Recall the definition of $\Sigma^0_n$ and $\Pi^0_n$ relations on $\omega^\omega$ or $2^\omega$ from Definition 2.6.7. We
use this to define the following central concept.


Definition 2.8.6. Fix $n \in \omega$ and let $\Gamma \in \{\Sigma^0_n, \Pi^0_n\}$. A subset $C$ of $\omega^\omega$ (or $2^\omega$) is a
$\Gamma$ class if there is a $\Gamma$ relation $R$ on $\omega^\omega$ (respectively, $2^\omega$) such that $f \in C \leftrightarrow R(f)$,
for all $f$ in $\omega^\omega$ (respectively, $2^\omega$).

Thus, $\Sigma^0_n$ and $\Pi^0_n$ relations are associated to $\Sigma^0_n$ and $\Pi^0_n$ classes, respectively, in a
way similar to how formulas are associated with the sets they define. In Chapter 5,
we will see how to make this analogy formal. Primarily, we will be interested in $\Pi^0_1$
classes, which are related to trees through the following result.


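Proposition 2.8.7. Let $C$ be a subset of $\omega^\omega$. The following are equivalent.


1. $C$ is a $\Pi^0_1$ class.
2. There is a computable tree $T \subseteq \omega^{<\omega}$ such that $C = [T]$.
3. There is a computable set $A \subseteq \omega^{<\omega}$ such that $C = \{f : (\forall k)[f \upharpoonright k \in A]\}$.


The same result holds if $\omega^\omega$ is replaced by $2^\omega$ and $\omega^{<\omega}$ by $2^{<\omega}$.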


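Proof. We prove (1) $\to$ (2). That (2) $\to$ (3) is obvious, since $T$ itself is a set
$A$ of the desired form. And that (3) $\to$ (1) follows from the fact that the relation
$R(f) \leftrightarrow (\forall k)[f \upharpoonright k \in A]$ is $\Pi^0_1$ (in the sense of Definition 2.6.7).

So fix a $\Pi^0_1$ class $C \subseteq \omega^\omega$. By definition, there is a computable sequence
$\langle e_0, e_1, \ldots \rangle$ of indices such that $\Phi^f_{e_i}(0)\downarrow$ for every $f$ and every $i$, and $f \in C$ if and only if
$(\forall i)[\Phi^f_{e_i}(0)\downarrow = 1]$. Define $T$ to be the set of all $\alpha \in \omega^{<\omega}$ such that for all $i < |\alpha|$, if
$\Phi^\alpha_{e_i}(0)\downarrow$ then $\Phi^\alpha_{e_i}(0)\downarrow = 1$. By the use principle (Proposition 2.4.7), $T$ is
computable, and clearly if $\alpha$ is in $T$ then so is every $\beta \preceq \alpha$. Hence, $T$ is a tree. Now by
assumption on the $e_i$,


$f \in C \leftrightarrow (\forall i)(\exists s)[\Phi^f_{e_i}(0)[s]\downarrow = 1]$
$\leftrightarrow (\forall i)(\forall s)[\Phi^f_{e_i}(0)[s]\downarrow \to \Phi^f_{e_i}(0)[s] = 1]$.


Hence by monotonicity of computations,


$f \in C \leftrightarrow (\forall k)(\forall i < k)[\Phi^{f \upharpoonright k}_{e_i}(0)\downarrow \to \Phi^{f \upharpoonright k}_{e_i}(0) = 1]$
$\leftrightarrow (\forall k)[f \upharpoonright k \in T]$
$\leftrightarrow f \in [T]$.


The proof is complete. $\square$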


$\Pi^0_1$ classes allow us to refer to elements of $[T]$ (for a tree $T$) without first needing
to define $T$ as a set of strings. For example, given a computable set $X$, we may wish
to consider the class $C$ of all $Y \subseteq X$. Then $C$ is a $\Pi^0_1$ class (in $2^\omega$), since “$Y$ is a
subset of $X$” is a $\Pi^0_1$ relation. Indeed, we could write


$C = [\{\sigma \in 2^{<\omega} : (\forall i < |\sigma|)[\sigma(i) = 1 \to i \in X]\}]$.


Sometimes we will write things like this out explicitly, but usually it will be more 
convenient to describe directly the property we want the elements of C to have. 
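
As a quick illustration, membership in the tree of binary strings that are compatible with being a subset of $X$ can be checked as in the following Python sketch. The names in_subset_tree and is_even are hypothetical, and the even numbers are used as an assumed example of a computable set $X$.

```python
from typing import Callable, Sequence


def in_subset_tree(sigma: Sequence[int], in_X: Callable[[int], bool]) -> bool:
    """sigma is in the tree iff sigma(i) = 1 implies i in X, for all i < |sigma|."""
    return all(in_X(i) for i, bit in enumerate(sigma) if bit == 1)


def is_even(i: int) -> bool:
    """Assumed example of a computable set X: the even numbers."""
    return i % 2 == 0
```

With $X$ the even numbers, the string 101 is in the tree while 01 is not, and every initial segment of a string in the tree is again in the tree. The paths through the tree are exactly the characteristic functions of the subsets of $X$.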
We can assign indices to $\Pi^0_1$ classes, as follows.


Definition 2.8.8. An index of a $\Pi^0_1$ class $C$ is a $\Delta^0_1$ index for a computable tree $T$ as
in Proposition 2.8.7 (2).

Alternatively, we could code a $\Pi^0_1$ class by the index of the computable sequence
$\langle e_0, e_1, \ldots \rangle$ specifying a $\Pi^0_1$ relation that defines $C$. But the proof of Proposition 2.8.7
is completely uniform, so we may move between these two types of indices uniformly
computably, and therefore the specific choice does not matter. Under either choice,
every $\Pi^0_1$ class has infinitely many indices.

As indicated at the end of the last section, our main interest in the sequel will be in 
bounded trees, typically subtrees of $2^{<\omega}$. In this setting, combining König’s lemma
(Theorem 2.8.3) with Proposition 2.8.7 yields the following simple fact, which is so
commonly used that it deserves a separate mention.


Corollary 2.8.9. If $C$ is a $\Pi^0_1$ class in $2^\omega$ and $T \subseteq 2^{<\omega}$ is a tree with $C = [T]$, then
$C \neq \emptyset$ if and only if $T$ is infinite.


Proposition 2.8.2 has the following corollary, using the fact that the Cantor space 
is compact. 


Corollary 2.8.10.


1. If $C$ is a $\Pi^0_1$ class in $2^\omega$ then $C$ is a compact set.
2. If $C_0 \supseteq C_1 \supseteq \cdots$ is a sequence of nonempty $\Pi^0_1$ classes in $2^\omega$ then $\bigcap_s C_s$ is
nonempty.


Proof. Both parts are general topological properties of compact spaces. Part (2) can
also be seen as follows. For each $s$, fix a computable tree $T_s$ with $[T_s] = C_s$. We may
assume $T_0 \supseteq T_1 \supseteq \cdots$. By hypothesis, each $T_s$ is infinite. Suppose $T = \bigcap_s T_s$ is finite.
Since $T \subseteq 2^{<\omega}$, this means there is a $k$ such that $T$ contains no string $\sigma \in 2^k$ (i.e., no
binary string of length $k$). For each such $\sigma$, let $s(\sigma)$ be the least $s$ such that $\sigma \notin T_s$.
Let $s_0 = \max\{s(\sigma) : \sigma \in 2^k\}$. Then $T_{s_0}$ contains no string of length $k$ and so is
finite, a contradiction. Thus $T$ is an infinite tree, so it has a path by König’s lemma,
and any such path belongs to every $C_s$. $\square$


Remark 2.8.11 (Compactness). Suppose that $C \subseteq 2^\omega$ is an empty $\Pi^0_1$ class. By
Proposition 2.8.7 (3), there is a computable $A$ (not necessarily a tree) such that
$C = \{f \in 2^\omega : (\forall k)[f \upharpoonright k \in A]\}$. Thus for each $f \in 2^\omega$, since $f \notin C$, it follows that
there is a $k$ such that $f \upharpoonright k \notin A$. But in fact, there is a single $k$ that bounds such a
witness for all $f \in 2^\omega$: for every $f$ there is a $j \leq k$ with $f \upharpoonright j \notin A$ (and if $A$ is a tree,
then $f \upharpoonright k \notin A$ itself). If $A$ is not a tree, we can let $T = \{\sigma \in 2^{<\omega} : (\forall \tau \preceq \sigma)[\tau \in A]\}$.
Then $T$ is a computable tree. If $A$ is a tree, let $T = A$. Either way,
$[T] = C$ and the existence of $k$ follows just as in the preceding proof. Typically, for
brevity, we say $k$ exists “by compactness”.


Example 2.8.12. Fix a Turing functional $\Phi$, and consider the $\Pi^0_1$ class $C$ of all $f \in 2^\omega$
such that $\Phi^f(0)\uparrow$. To be precise, $f \in C$ if and only if $(\forall s)[\Phi^f(0)[s]\uparrow]$. (Note that
the matrix of this formula is a computable predicate in $f$.) Suppose $C = \emptyset$. Then
by compactness, there is a $k$ such that for all $f \in 2^\omega$, $(\exists s < k)[\Phi^f(0)[s]\downarrow]$. (This
illustrates an interesting computability theoretic fact: a functional that converges
on every oracle cannot converge arbitrarily late.)
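
When a functional does converge on every oracle, such a bound $k$ can be found by brute force over the finitely many binary strings of each length. The following Python sketch illustrates this for a toy functional. Both phi and uniform_bound are hypothetical names, and phi is an assumed stand-in for $\Phi^\sigma(0)[s]$ that looks only at the part of the oracle it has been given.

```python
from itertools import product
from typing import Callable, Optional, Tuple

Oracle = Tuple[int, ...]


def phi(sigma: Oracle, s: int) -> Optional[int]:
    """Toy stand-in for Phi^sigma(0)[s]; None means no convergence by stage s.

    If the oracle starts with 1, it halts after one step. Otherwise it reads
    two more bits and halts after three steps.
    """
    if len(sigma) >= 1 and s >= 1 and sigma[0] == 1:
        return 1
    if len(sigma) >= 3 and s >= 3 and sigma[0] == 0:
        return sigma[1] + sigma[2]
    return None


def uniform_bound(
    functional: Callable[[Oracle, int], Optional[int]], max_k: int = 16
) -> Optional[int]:
    """Least k <= max_k such that every sigma of length k converges at some stage s < k."""
    for k in range(max_k + 1):
        if all(
            any(functional(sigma, s) is not None for s in range(k))
            for sigma in product((0, 1), repeat=k)
        ):
            return k
    return None
```

For this toy functional the search returns 4: every oracle beginning with 1 converges at stage 1, every oracle beginning with 0 converges at stage 3, and so every string of length 4 converges at some stage below 4. The cutoff max_k is only a safeguard for the sketch. For a functional that converges on every oracle, the unbounded search would also halt, by compactness.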

To see this another way, consider the computable tree $T$ such that $C = [T]$. $T$ is
the set of all $\sigma \in 2^{<\omega}$ such that $\Phi^\sigma(0)\uparrow$. If $C = \emptyset$, let $k$ be such that no $\sigma$ of length
$k$ belongs to $T$, meaning $\Phi^\sigma(0)\downarrow$ and hence $(\exists s < k)[\Phi^\sigma(0)[s]\downarrow]$. Then for all
$f \in 2^\omega$, $(\exists s < k)[\Phi^f(0)[s]\downarrow]$, as before.

We will encounter many examples of $\Pi^0_1$ classes in this book. An important one
is provided by the following definition.


Definition 2.8.13. A function $f : \omega \to \omega$ is diagonally noncomputable (or DNC for
short) if, for every $e \in \omega$, if $\Phi_e(e)\downarrow$ then $f(e) \neq \Phi_e(e)$.


Proposition 2.8.14. The set of all 2-valued DNC functions is a $\Pi^0_1$ class.

Proof. We claim that the class $C$ of all $f \in 2^\omega$ such that


$(\forall e)[\Phi_e(e)\uparrow \vee \Phi_e(e) \neq f(e)]$

is a $\Pi^0_1$ class. Define $T$ to be the set of all $\sigma \in 2^{<\omega}$ such that

$(\forall e < |\sigma|)[\Phi_e(e)[|\sigma|]\downarrow \to \sigma(e) \neq \Phi_e(e)]$. (2.1)


Then $T$ is a computable subtree of $2^{<\omega}$. If $f$ is 2-valued and DNC, then (2.1)
obviously holds for $\sigma = f \upharpoonright k$, for every $k$. Hence, $f \in [T]$.

Conversely, suppose $f \in [T]$ and fix $e$ such that $\Phi_e(e)\downarrow$. Let $s$ be such that
$\Phi_e(e)[s]\downarrow$ and consider any $k > s$. Then also $\Phi_e(e)[k]\downarrow$. Now, by Convention 2.4.8
we also have $k > e$. Since $f \upharpoonright k \in T$, taking $\sigma = f \upharpoonright k$ in (2.1) we obtain that
$f(e) \neq \Phi_e(e)$. Thus, $f$ is DNC. Hence, $C = [T]$. $\square$


It is instructive to reflect on this proof for a moment. Crucially, T above is not 
the tree of all $\sigma \in 2^{<\omega}$ such that for all $e < |\sigma|$, $\sigma(e) \neq \Phi_e(e)$; that is $\mathrm{Ext}(T)$. $T$
itself contains many other nodes. To better understand this, let us present the proof 
in slightly different terms. 

Imagine we are building T by stages. At stage s, we must decide which strings 
o of length s to put into T. Of course, we must do this consistently with making T 
be a tree, so if s > 0 we cannot add any string that does not already have an initial 
segment in $T$ of length $s - 1$. But in addition, we must make sure that $[T]$ ends up
being the set of 2-valued DNC functions. 

To this end, we can check which $e < s$ satisfy that $\Phi_e(e)[s]\downarrow$, and then exclude
any $\sigma$ of length $s$ such that $\sigma(e) = \Phi_e(e)$ for some such $e$. Any other $\sigma$ of length
$s$ we must add to $T$, however. This is because, as far as we can tell at stage $s$, $\sigma$
looks like an initial segment of a DNC function. And we cannot add $\sigma$ at any later
stage, since that is reserved for longer strings. This allows the possibility that $\sigma$ is
put into $T$, and there is still an $e < |\sigma|$ with $\Phi_e(e)\downarrow = \sigma(e)$, only the least $s^*$ such
that $\Phi_e(e)[s^*]\downarrow$ is larger than $|\sigma|$. Thus, $\sigma$ is in fact not an initial segment of any
DNC function. Now, we cannot remove $\sigma$ from $T$. But we can ensure $\sigma \notin \mathrm{Ext}(T)$,
by “pruning” $T$ above $\sigma$ to ensure no path through $T$ extends it. That is, at the stage
$s^*$, we exclude all $\tau \succeq \sigma$ of length $s^*$ from $T$. Then for every function $f \succ \sigma$, we
have $f \upharpoonright s^* \notin T$ and hence $f \notin [T]$.
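
The stagewise description can be made concrete with a small Python sketch. It is a toy model only: the table DIAGONAL is a hypothetical stand-in for the diagonal computations $\Phi_e(e)$, recording for finitely many $e$ the stage at which the computation halts and its output, and every other $e$ is treated as divergent. The function in_T then implements condition (2.1) for this table.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical table standing in for the computations Phi_e(e):
# e -> (stage at which Phi_e(e) halts, output). A missing e means divergence.
DIAGONAL: Dict[int, Tuple[int, int]] = {0: (1, 0), 2: (5, 1)}


def phi_diag(e: int, s: int) -> Optional[int]:
    """Toy Phi_e(e)[s]: the output if the computation has halted by stage s."""
    if e in DIAGONAL and DIAGONAL[e][0] <= s:
        return DIAGONAL[e][1]
    return None


def in_T(sigma: Sequence[int]) -> bool:
    """Condition (2.1): no e < |sigma| has halted by stage |sigma| with output sigma(e)."""
    s = len(sigma)
    return all(phi_diag(e, s) != sigma[e] for e in range(s))
```

In this model the string 101 is put into the tree at stage 3, even though it agrees with the output of computation 2, because that computation has not yet halted. The string 1010 is still in the tree at stage 4, and at stage 5 every extension of 101 is excluded. So 101 is a node of the tree that is not extendible, exactly as in the pruning described above.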

Note that, by definition, $\emptyset'$ is the set of those $e \in \omega$ such that $\Phi_e(e)\downarrow$, and so
$\emptyset'$ can compute a 2-valued DNC function. In particular, the $\Pi^0_1$ class of
Proposition 2.8.14 is nonempty. Moreover, it is clear that no DNC function can be
computable.


Corollary 2.8.15. There is a nonempty $\Pi^0_1$ class with no computable member.

2.8.2 Basis theorems


Although $\Pi^0_1$ classes do not always have computable members, can anything be said
about their elements in general? It is easy to see that for every $f \in 2^\omega$ there is
a nonempty $\Pi^0_1$ class $C$ that does not contain $f$. (Let $C$ be the set of all $g \in 2^\omega$
with $g(0) \neq f(0)$.) Hence, positive results about members of $\Pi^0_1$ classes are usually
presented in terms of collections of functions: every nonempty $\Pi^0_1$ class contains some
function from the collection.


Definition 2.8.16. A collection $\mathcal{B}$ of functions $f \in 2^\omega$ closed downward under $\leq_T$
is a basis for $\Pi^0_1$ classes if every nonempty $\Pi^0_1$ class $C$ contains an element in $\mathcal{B}$.


Of course, different $\Pi^0_1$ classes may contain different elements of a basis $\mathcal{B}$. Perhaps
the easiest basis to recognize, in light of Proposition 2.8.5, is the set of functions
computable from $\emptyset'$.


Theorem 2.8.17 (Kleene). The collection of $\emptyset'$-computable functions is a basis for
$\Pi^0_1$ classes.


Proof. By Proposition 2.8.5, $\emptyset'$ can compute the sequence

$\sigma_0 \prec \sigma_1 \prec \cdots$

in the proof of the implication (3) $\to$ (1) of Theorem 2.8.3. Hence, the path $\bigcup_{s \in \omega} \sigma_s$
is computable in $\emptyset'$. $\square$


This result can be improved in many ways. The literature is full of basis theorems 
of various kinds, with applications in many different areas of computability theory. 
For a partial survey, see Diamondstone, Dzhafarov, and Soare [66]. For our purposes, 
we now state and prove two especially prominent basis theorems which we will apply
repeatedly in the sequel. 

The method of proof here is known as forcing with $\Pi^0_1$ classes or Jockusch–Soare
forcing. We will study forcing systematically in Chapter 7, and see further
applications of Jockusch–Soare forcing there (particularly in Section 7.7). But we
can use the basic method already without needing to know the theory of forcing
as a whole. The idea is as follows. We are given a computable tree $T$, and wish
to produce a $g \in [T]$ with certain computational and combinatorial properties. We
build a nested sequence

$T_0 \supseteq T_1 \supseteq T_2 \supseteq \cdots$


of infinite computable subtrees of $T$, with each $T_e$ ensuring some more of the
properties we wish $g$ to have in the end. Typically, these properties are organized
into infinitely many requirements, $R_0, R_1, \ldots$, and each element of $[T_e]$ satisfies,
e.g., $R_i$ for all $i < e$. In this way, any element of $\bigcap_e [T_e]$ will satisfy all the $R_e$, as
desired. (Of course, the fact that there is at least one element in $\bigcap_e [T_e]$ follows by
Corollary 2.8.10.)

We begin with arguably the most famous basis theorem of all, due to Jockusch
and Soare [172].

Theorem 2.8.18 (Low basis theorem; Jockusch and Soare [172]). The collection
of low functions is a basis for $\Pi^0_1$ classes.


Proof. Let $C$ be a nonempty $\Pi^0_1$ class, and fix an infinite computable tree $T \subseteq 2^{<\omega}$
such that $C = [T]$. As described above, we build a sequence


$T_0 \supseteq T_1 \supseteq T_2 \supseteq \cdots$


of infinite computable subtrees of $T$. In this proof, we also build a $\emptyset'$-computable
function $j : \omega \to 2$, such that along with our sequence the following requirement is
satisfied for each $e \in \omega$:


$R_e : (\forall f \in [T_{e+1}])[e \in f' \leftrightarrow j(e) = 1]$.


Thus, we will ensure that either $e \in f'$ for all $f \in [T_{e+1}]$, or $e \notin f'$ for all $f \in [T_{e+1}]$,
and which of these is the case is recorded by the $\emptyset'$-computable function $j$. By
taking $g \in \bigcap_e [T_e]$, it follows that for all $e$ we have $e \in g'$ if and only if $j(e) = 1$, so
$g' \leq_T j \leq_T \emptyset'$. In fact, $j$ just is the jump of $g$.

We proceed to the details. Let $T_0 = T$, and suppose that for some $e$ we have
defined $T_e$ and $j \upharpoonright e$. We define $T_{e+1}$ and $j(e)$. Define $U = \{\sigma \in T_e : \Phi_e^\sigma(e)\uparrow\}$. This
is a computable subtree of $T_e$, so by Proposition 2.8.5, $\emptyset'$ can determine whether
or not $U$ is infinite. If it is, let $T_{e+1} = U$. Now if $f$ is any element of $[T_{e+1}]$ then
$\Phi_e^f(e)\uparrow$, so $e \notin f'$. We then define $j(e) = 0$, and now we have clearly satisfied $R_e$.
Suppose next that $U$ is finite. Then there is a $k$ such that no string $\sigma \in T_e$ of length
$k$ belongs to $U$, meaning that every such string satisfies $\Phi_e^\sigma(e)\downarrow$. Choose any string
$\sigma \in \mathrm{Ext}(T_e)$ of length $k$, and let $T_{e+1} = \{\tau \in T_e : \tau \preceq \sigma \vee \sigma \preceq \tau\}$. This time, if
$f \in [T_{e+1}]$ then $e \in f'$. We let $j(e) = 1$, so again $R_e$ is satisfied. $\square$


Actually, the $g$ we chose above is the unique path through $\bigcap_e T_e$. (See
Exercise 2.9.11.) Some proofs of the low basis theorem emphasize this by adding
intermediate steps that determine “more and more” of $g$ along the way explicitly, but
this is not needed.

As a side note, combining Theorem 2.8.18 and Corollary 2.8.15 yields a (some- 
what roundabout) proof that there exist noncomputable low sets. 

The next basis theorem we look at uses the following computability theoretic 
notion. 


Definition 2.8.19. Fix $S \in 2^\omega$.


1. $S$ is hyperimmune if no computable function dominates its principal function,
$p_S$.

2. $S$ is of, or has, hyperimmune free degree if no $S^* \leq_T S$ is hyperimmune.
Otherwise, $S$ is of, or has, hyperimmune degree.


In general, to show that $S$ is hyperimmune we exhibit for each $e \in \omega$ an $x$ such that
$\Phi_e(x)\uparrow$ or $\Phi_e(x) < p_S(x)$. The following shows that such sets can be found Turing
below every noncomputable $\Delta^0_2$ set.

Theorem 2.8.20 (Miller and Martin [213]). If $\emptyset <_T S \leq_T \emptyset'$ then $S$ has
hyperimmune degree.


On the flip side, we have the following famed characterization of having
hyperimmune free degree.


Theorem 2.8.21 (Miller and Martin [213]). A set $S$ has hyperimmune free degree
if and only if every $S$-computable function is dominated by a computable function.


For this reason, some authors use the term “computably dominated” instead of
“hyperimmune free”.


Theorem 2.8.22 (Hyperimmune free basis theorem; Jockusch and Soare [172]).
The collection of functions of hyperimmune free degree is a basis for $\Pi^0_1$ classes.


Proof. As above, let $C$ be given and fix $T$ so that $C = [T]$. We again define a
sequence

$T_0 \supseteq T_1 \supseteq T_2 \supseteq \cdots$,


of infinite computable subtrees of $T$. Our goal this time is to satisfy the following
requirement for each $e \in \omega$:


$R_e : (\forall f \in [T_{e+1}])[\Phi_e^f \text{ is not total}]$
$\vee\ (\exists h \leq_T \emptyset)(\forall f \in [T_{e+1}])[\Phi_e^f \text{ is total} \wedge (\forall x)[\Phi_e^f(x) \leq h(x)]]$.


If, at the end of the construction, we choose some $g \in \bigcap_e [T_e]$, then we will have that
for all $e \in \omega$, if $\Phi_e^g$ is total then it is dominated by a computable function. Hence, by
Theorem 2.8.21, $g$ will have hyperimmune free degree, as desired.

Let us proceed to the construction. Let $T_0 = T$ and suppose inductively that for
some $e$ we have defined $T_e \subseteq T$. For each $x$, define $U_x = \{\sigma \in T_e : \Phi_e^\sigma(x)\uparrow\}$, which
is a subtree of $T_e$, uniformly computable in $x$. If, for some $x$, $U_x$ is infinite, we let
$T_{e+1} = U_x$. Now every $f \in [T_{e+1}]$ satisfies $\Phi_e^f(x)\uparrow$, hence $R_e$ is satisfied trivially.

Suppose that $U_x$ is finite for every $x$. We let $T_{e+1} = T_e$. Then in particular, $\Phi_e^f$ is
total for every $f \in [T_{e+1}]$. We now define a computable function $h$ as follows. On
input $x$, $h$ searches for the least $\ell$ such that $\Phi_e^\sigma(x)\downarrow$ for all $\sigma \in T_{e+1}$ of length $\ell$,
and then it outputs the supremum of the values of all these computations. (Notice
that $\ell$ must exist, else $U_x$ would be infinite.) Since every $f \in [T_{e+1}]$ extends some
$\sigma \in T_{e+1}$ of length $\ell$, we have $\Phi_e^f(x) \leq h(x)$. And since this holds for all $x$, $R_e$
holds. $\square$
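
The search performed by $h$ in the second case can be sketched in Python as follows. This is a toy model under stated assumptions: phi is a hypothetical total functional standing in for $\Phi_e^\sigma(x)$ (it counts the 1s among the first $x + 1$ oracle bits), and in_tree is an assumed computable tree playing the role of $T_{e+1}$ (binary strings with no two consecutive 1s).

```python
from itertools import count, product
from typing import Optional, Tuple

Oracle = Tuple[int, ...]


def phi(sigma: Oracle, x: int) -> Optional[int]:
    """Toy functional: the number of 1s among sigma(0..x); None if sigma is too short."""
    if len(sigma) <= x:
        return None
    return sum(sigma[: x + 1])


def in_tree(sigma: Oracle) -> bool:
    """Toy infinite computable tree: binary strings with no two consecutive 1s."""
    return all(not (a == 1 and b == 1) for a, b in zip(sigma, sigma[1:]))


def h(x: int) -> int:
    """Find the least level on which phi converges everywhere; return the largest value."""
    for ell in count():
        level = [s for s in product((0, 1), repeat=ell) if in_tree(s)]
        values = [phi(s, x) for s in level]
        if all(v is not None for v in values):
            # The tree is infinite, so every level is nonempty and max is defined.
            return max(v for v in values if v is not None)
    raise AssertionError("unreachable: the search above only exits by returning")
```

In this model, $h$ bounds the value of the functional along every path through the tree, without knowing which path will eventually be chosen. The search terminates here because the toy functional converges on every sufficiently long string, which mirrors the assumption in the proof that each $U_x$ is finite.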


The final basis theorem we prove will have important applications in Chapters 8 
and 9. This theorem is also due to Jockusch and Soare [172]. 


Theorem 2.8.23 (Cone avoidance basis theorem). Fix $C \nleq_T \emptyset$. The collection of
functions $g \in 2^\omega$ such that $C \nleq_T g$ is a basis for $\Pi^0_1$ classes.


Proof. Fix $\mathcal{C}$ and $T$ with $\mathcal{C} = [T]$. We define a sequence

$$T_0 \supseteq T_1 \supseteq T_2 \supseteq \cdots$$

of infinite computable subtrees of $T$. We aim to satisfy the following requirement for
each $e \in \omega$:

$$R_e : (\forall f \in [T_{e+1}])[\Phi_e^f \neq C].$$

Choosing any $g \in \bigcap_e [T_e]$, it follows that $C \nleq_T g$, as desired. Let $T_0 = T$, and
suppose that for some $e$ we have defined $T_e$. For each $x$, define $U_x$ as in the proof of
the hyperimmune free basis theorem: that is, $U_x = \{\sigma \in T_e : \Phi_e^\sigma(x)\uparrow\}$, which is a
computable subtree of $T_e$. As before, if $U_x$ is infinite for some $x$, we let $T_{e+1} = U_x$,
so that every $f \in [T_{e+1}]$ satisfies $\Phi_e^f(x)\uparrow$. In this case, then, $R_e$ is satisfied.

Suppose that $U_x$ is finite for each $x$. We claim there is an $x$ and a $\sigma \in \operatorname{Ext}(T_e)$
such that $\Phi_e^\sigma(x)\downarrow \neq C(x)$. Suppose not. Then for each $x$, there exist numbers $k_x$
and $y_x$ such that for every $\sigma \in T_e$ of length $k_x$, we have $\Phi_e^\sigma(x)\downarrow = y_x$. This is
because $U_x$ is finite, so there is a $k$ such that $\Phi_e^\tau(x)\downarrow$ for every $\tau \in T_e$ of length $k$.
By hypothesis, no such $\tau$ with $\Phi_e^\tau(x) \neq C(x)$ is extendible, so by König’s lemma,
there is a $k^* > k$ such that no $\sigma \in T_e$ of length $k^*$ extends any such $\tau$. Thus, every
$\sigma \in T_e$ of length $k^*$ satisfies $\Phi_e^\sigma(x)\downarrow = C(x)$, and we can consequently take $k_x = k^*$
and $y_x = C(x)$. But $k_x$ and $y_x$ can be searched for, and found, computably. (Notice
that finding them does not require knowing the value of $C(x)$.) Thus, $y_x = C(x)$ can
be found computably for each $x$, which makes $C$ computable, a contradiction. This
proves the claim, that there is an $x$ and a $\sigma \in \operatorname{Ext}(T_e)$ such that $\Phi_e^\sigma(x)\downarrow \neq C(x)$. Fix
such a $\sigma$ and let $T_{e+1} = \{\tau \in T_e : \tau \preceq \sigma \vee \sigma \preceq \tau\}$. Now every $f \in [T_{e+1}]$ satisfies
$\Phi_e^f(x)\downarrow = \Phi_e^\sigma(x) \neq C(x)$, so $R_e$ is satisfied. □


2.8.3 PA degrees 


In our discussion above, we saw one oracle, $\emptyset'$, that could compute a member of
every nonempty $\Pi^0_1$ class. As it turns out, there are many such oracles.


Definition 2.8.24. A function $f \in 2^\omega$ is of, or has, PA degree, written $f \gg \emptyset$, if
the collection of $f$-computable functions is a basis for $\Pi^0_1$ classes. Relativized to an
oracle $g$, we say $f$ has PA degree relative to $g$, written $f \gg g$.


Thus, $f \gg \emptyset$ if and only if every nonempty $\Pi^0_1$ class has an $f$-computable member,
if and only if every infinite computable subtree of $2^{<\omega}$ has an $f$-computable path.
Note that if $f \equiv_T f^*$, $g \equiv_T g^*$, and $f \gg g$ then $f^* \gg g^*$. So being “of PA degree”
really is a degree property, justifying the terminology.

The abbreviation “PA” stands for “Peano arithmetic”, as it follows by an old result
of Scott [274] and Solovay (unpublished) that $f$ has PA degree if and only if $f$
computes a complete consistent extension of Peano arithmetic. For a modern proof
of this equivalence, as well as several other characterizations of having PA degree,
see Soare [295, Section 10.3]. Here, we restrict to the properties that will be most
useful to us in the rest of the book.


Theorem 2.8.25. Fix $f \in 2^\omega$. The following are equivalent.


1. $f \gg \emptyset$.

2. $f$ computes a 2-valued DNC function.

3. Every 2-valued partial computable function has a total $f$-computable extension.

4. There is an $f$-computable function $d : \omega^2 \to \omega$ such that $d(e_0, e_1) \in \{e_0, e_1\}$
for all $e_0, e_1 \in \omega$, and if $(\exists i < 2)[\Phi_{e_i}(0)\uparrow]$ then $\Phi_{d(e_0,e_1)}(0)\uparrow$.


Proof. $(1) \to (2)$: Immediate from Proposition 2.8.14.


$(2) \to (3)$: Suppose $\Phi_e$ is a 2-valued partial computable function. By the $S^m_n$
theorem (Theorem 2.4.10), we can fix a computable function $S : \omega \to \omega$ such
that $\Phi_{S(x)}(z) \simeq \Phi_e(x)$ for all $x$ and $z$. Then in particular $\Phi_e(x)\downarrow$ if and only
if $\Phi_{S(x)}(S(x))\downarrow$, in which case the two computations agree. By (2), let $g$ be an
$f$-computable 2-valued DNC function, and define $h = 1 - (g \circ S)$. Then $h$ is
$f$-computable and 2-valued, and for all $x$ such that $\Phi_e(x)\downarrow$ we have
$g(S(x)) = 1 - \Phi_{S(x)}(S(x)) = 1 - \Phi_e(x)$, so $h(x) = \Phi_e(x)$.


$(3) \to (4)$: Consider the partial computable function defined by

$$p(e_0, e_1) = (\mu i < 2)(\exists s)[\Phi_{e_i}(0)[s]\downarrow \wedge (\forall t < s)\, \Phi_{e_{1-i}}(0)[t]\uparrow]$$

for all $e_0, e_1 \in \omega$. By (3), let $h$ be an $f$-computable 2-valued function extending $p$,
and define $d(e_0, e_1) = e_{1-h(e_0,e_1)}$ for all $e_0, e_1$. Now if $\Phi_{e_i}(0)\uparrow$ and $\Phi_{e_{1-i}}(0)\downarrow$ then
$p(e_0, e_1) = 1 - i$, so $d(e_0, e_1) = e_i$, as desired.


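$(4) \to (1)$: Fix an infinite computable tree $T$. By the $S^m_n$ theorem, we can fix a
computable function $S : \omega^2 \to \omega$ such that for every $\sigma \in 2^{<\omega}$ and $i < 2$,

$$\Phi_{S(\sigma,i)}(x) = \begin{cases} (\mu k \in \omega)(\forall \tau)[(\tau \succeq \sigma i \wedge |\tau| = k) \to \tau \notin T] & \text{if } x = 0, \\ i & \text{otherwise,} \end{cases}$$

for all $x \in \omega$. Thus, $\Phi_{S(\sigma,i)}(0)\uparrow$ if and only if $\sigma i \in \operatorname{Ext}(T)$. Let $d \leq_T f$ be as given
by (4). Then $d$ can computably construct a sequence $\sigma_0 \prec \sigma_1 \prec \cdots$ of elements of
$T$ with $|\sigma_s| = s$ for all $s$. Namely, let $\sigma_0 = \langle\rangle$, and suppose inductively that we have
defined $\sigma_s \in \operatorname{Ext}(T)$ for some $s \in \omega$. Then there must be an $i < 2$ such that
$\sigma_s i \in \operatorname{Ext}(T)$, and hence such that $\Phi_{S(\sigma_s,i)}(0)\uparrow$. By assumption,
$d(S(\sigma_s,0), S(\sigma_s,1)) = S(\sigma_s,i)$ for some such $i$, and $d$ can compute this $i$ since
$\Phi_{d(S(\sigma_s,0),S(\sigma_s,1))}(1) = \Phi_{S(\sigma_s,i)}(1)\downarrow = i$. We then let $\sigma_{s+1} = \sigma_s i$. So $\bigcup_s \sigma_s$ is a $d$-computable (hence
$f$-computable) path through $T$, and since $T$ was arbitrary, $f \gg \emptyset$. □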
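The inductive construction in $(4) \to (1)$ can be sketched as follows. Here `choose` is a hypothetical stand-in for the selector obtained from $d$: given a node $\sigma$, it returns a bit $i$ such that $\sigma i$ is extendible whenever such a bit exists. In the toy tree below extendibility happens to be decidable, which is an assumption of the example only; in general it is an oracle of PA degree that supplies such a selector.

```python
def build_path(choose, length):
    """Extend the empty string one bit at a time, as in the proof:
    sigma_0 = <>, sigma_{s+1} = sigma_s followed by choose(sigma_s)."""
    sigma = ""
    for _ in range(length):
        sigma += str(choose(sigma))
    return sigma

# Toy tree: binary strings with no two consecutive 1s. Every node of this
# tree is extendible (append 0), so a selector is easy to write down.
def in_tree(sigma):
    return "11" not in sigma

def choose(sigma):
    # Prefer 1 when that stays inside the tree, otherwise take 0.
    return 1 if in_tree(sigma + "1") else 0
```

Calling `build_path(choose, n)` yields the first `n` bits of a path through the toy tree, mirroring the sequence $\sigma_0 \prec \sigma_1 \prec \cdots$.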


The equivalence of (1) and (2) means that there is a tree (namely, the one constructed
in Proposition 2.8.14) whose paths are, up to Turing equivalence, precisely
the functions of PA degree. It follows that if $\mathcal{B}$ is any basis for $\Pi^0_1$ classes then there
is some $f \in \mathcal{B}$ of PA degree. So, for example, not only does every nonempty $\Pi^0_1$
class have a low member, but there is a single low $f$ that computes a member of
every $\Pi^0_1$ class.

Part (4) merits a bit of elucidation. One application is as follows. Suppose we
are given a computable sequence of pairs of computable sets $\langle (A_{0,s}, A_{1,s}) : s \in \omega \rangle$.
Then $\emptyset'$ can tell us, for each $s$, whether $A_{0,s}$ is empty, whether $A_{1,s}$ is empty, or
whether both are empty. But if we simply wanted to know one of $A_{0,s}$ or $A_{1,s}$ that
is empty, assuming at least one is, then (4) says exactly that we do not need $\emptyset'$,
as any $f \gg \emptyset$ will do. Or suppose we wanted to know one of $A_{0,s}$ or $A_{1,s}$ that is
infinite, assuming at least one is. Of course, $\emptyset''$ can answer this for each $s$, but by
(4), relativized to $\emptyset'$, so can any $f \gg \emptyset'$. Note that being infinite is a $\Pi^0_2$ property
of a computable set, hence by Post’s theorem (Theorem 2.6.2), a $\Pi^0_1$ property relative to $\emptyset'$.

We conclude this section by looking at some properties of $\gg$ as an ordering of
$2^\omega$. It is easy to see that $\gg$ is transitive. By relativizing Corollary 2.8.15, we see that
$\gg$ is also anti-reflexive and hence anti-symmetric. The following establishes that $\gg$
is dense.


Proposition 2.8.26 (Density of the PA degrees, Simpson [284]). Fix $f, g \in 2^\omega$.
If $f \gg g$, there exists $h \in 2^\omega$ such that $f \gg h \gg g$.


Proof. Relativize Proposition 2.8.14 and Theorem 2.8.25 to $g$ to obtain an infinite
$g$-computable tree $T \subseteq 2^{<\omega}$ such that if $h$ is any path through $T$ then $h \gg g$. Define
$U$ to be the set of all pairs of binary strings $(\sigma, \tau)$ such that $|\sigma| = |\tau|$, $\tau \in T$, and
furthermore,

$$(\forall e < |\sigma|)[\Phi_e^\tau(e)\downarrow \;\to\; \sigma(e) \neq \Phi_e^\tau(e)]. \tag{2.2}$$

Clearly, $U$ is a $g$-computable set, and if $(\sigma, \tau) \in U$ then also $(\sigma \upharpoonright k, \tau \upharpoonright k) \in U$ for
all $k \leq |\sigma| = |\tau|$. We may thus regard $U$ as a $g$-computable tree in the obvious way.
Since $T$ is infinite, it is also clear that so is $U$, i.e., that for every $k$ there is a $(\sigma, \tau) \in U$
with $|\sigma| = |\tau| = k$. Thus, as $f \gg g$, $f$ can compute a path through $U$, which is a pair
of functions $(h_0, h)$ such that $(h_0 \upharpoonright k, h \upharpoonright k) \in U$ for all $k$. By definition, $h$ is also a
path through $T$, so $h \gg g$. Moreover, $h_0$ satisfies that for all $e$, $h_0(e) \neq \Phi_e^h(e)$ if the
latter converges, so $h_0$ is DNC relative to $h$ and hence by Theorem 2.8.25, $h_0 \gg h$.
Since $h_0 \leq_T f$, it follows that $f \gg h \gg g$, as was to be shown. □


One interesting consequence of this is the following, seemingly stronger property of
having PA degree. Namely, consider any $f \gg \emptyset$, and fix $g$ so that $f \gg g \gg \emptyset$. Then
not only does every nonempty $\Pi^0_1$ class have a member $h \leq_T f$, but it has a member
$h \leq_T g$ and therefore a member $h$ so that $f \gg h$. This is a useful tool in certain
model constructions in reverse mathematics that we will encounter in Section 4.6.


2.9 Exercises 


Exercise 2.9.1. Show that the following functions and relations are primitive recur- 
sive: 


1. $f(s) = 2^s$.
2. $R(s)$ = “$s$ is a power of 3”.
3. $M(s, t)$ = “$s$ is a multiple of $t$”.
4. $g(s)$ returns the smallest prime factor of $s$.
5. If $R(n)$ is primitive recursive, and so are $f, g : \omega \to \omega$, then the function

$$h(s, t) = \begin{cases} f(s) & \text{if } R(t) = 0, \\ g(s) & \text{if } R(t) = 1, \end{cases}$$


is also primitive recursive. 


Exercise 2.9.2. Show that the following functions and relations are primitive recur- 
sive, and then complete the proof of Theorem 2.2.13. 


1. $p(i)$ is the $(i + 1)$st prime: $p(0) = 2$, $p(1) = 3$, $p(2) = 5$, ....
2. a(s) returns the number of distinct prime factors of s. 
3. x(s, 1) as in Definition 2.2.12. 


Exercise 2.9.3. Show that every infinite c.e. set has an infinite computable subset. 


Exercise 2.9.4. Prove that a set $A \subseteq \omega$ is computable if and only if $A$ is finite or $A$
can be computably enumerated in strictly increasing order. 


Exercise 2.9.5. Justify the second sentence of Theorem 2.5.4 by explicitly showing 
how to construct the function f described, using the effective indexing from the 
normal form theorem. 


Exercise 2.9.6. This is a warmup for Exercise 2.9.7. Show that the following sets are 
not computable: 


1. $\{e \in \omega : 0 \in \operatorname{range}(\Phi_e)\}$.
2. $\{e \in \omega : \operatorname{range}(\Phi_e) \text{ is bounded}\}$.


Exercise 2.9.7 (Rice’s theorem, Rice [258]). Suppose $\mathcal{C}$ is a set of partial computable
functions and let $I_{\mathcal{C}} = \{n : \Phi_n \in \mathcal{C}\}$. If $I_{\mathcal{C}}$ is computable then $\mathcal{C} = \emptyset$ or $\mathcal{C}$
contains all the partial computable functions.


1. Prove this result using the recursion theorem.
2. Re-prove the result by showing that, if $\mathcal{C}$ is not empty and does not contain every
partial computable function, then $\emptyset' \leq_T I_{\mathcal{C}}$.


Exercise 2.9.8. Suppose that $F(s, t_1, \ldots, t_n)$ is a partial computable function. There
is an index $e$ for which

$$\Phi_e(t_1, \ldots, t_n) \simeq \lambda t_1, \ldots, t_n . F(e, t_1, \ldots, t_n).$$

This was the form of the fixed-point theorem originally proved by Kleene [180]. 


Exercise 2.9.9. Show there are indices $m$ and $n$ so that $\Phi_m(0) = n$ and $\Phi_n(0) = m$.


Exercise 2.9.10. Prove that there is a partial computable function $f : \omega \to \omega$ for
which there is no total computable function $g : \omega \to \omega$ that agrees with $f$ whenever $f$ is defined.
Colloquially: not every partial computable function can be extended to a (total)
computable function. (Hint: Consider $f(e) = (\mu s)[T(s, e, e) = 0]$.)


Exercise 2.9.11. Show that if $X, Y \in 2^\omega$ and $X' = Y'$ then $X = Y$.


Exercise 2.9.12. In Definition 2.3.1, the operator $M$ looks for the least $y$ such that
$f(y, x)\downarrow = 0$ and $f(y^*, x)\downarrow \neq 0$ for all $y^* < y$. We could define an alternate
operator $M'$ such that $M'(f)(x)$ is simply the least $y$ such that $f(y, x)\downarrow = 0$, with
no consideration of whether $f(y^*, x)\uparrow$ for $y^* < y$. Let $\mathcal{C}$ be the class of partial
computable functions.


* Let $\mathcal{C}'$ be the smallest class of finitary functions which contains the primitive
recursive functions and is closed under generalized composition, primitive
recursion, and applying $M'$ to functions that are total.

* Let $\mathcal{C}''$ be the smallest class of finitary partial functions which contains the
primitive recursive functions and is closed under generalized composition, primitive
recursion, and applying $M'$ to arbitrary functions (partial or total).


Prove that $\mathcal{C} = \mathcal{C}'$ and $\mathcal{C}$ is a proper subset of $\mathcal{C}''$.


Exercise 2.9.13. Suppose that $T$ is a tree that is definable by a $\Pi^0_1$ formula. That is, the
set of nodes that are not in the tree is c.e. Show that there is a computable tree T’ 
that has the same set of paths as T. 


Exercise 2.9.14. Let T be the tree constructed in Proposition 2.8.14. Prove that 
$\operatorname{Ext}(T)$ is not computable.


Exercise 2.9.15. Show that for all $k \geq 2$ there exists a $\Delta^0_2$ sequence of sets
$\langle P_0, \ldots, P_{k-1} \rangle$ whose members partition $\omega$ and for each $i < k$, $\omega \setminus P_i$ is
hyperimmune. (A set $P$ such that it and its complement are hyperimmune is said to be
bi-hyperimmune. With $k = 2$, this result gives the existence of $\Delta^0_2$ bi-hyperimmune
sets.)


Exercise 2.9.16. A function $f : \omega \to \omega$ is fixed point free (or FPF) if $(\forall e)[W_{f(e)} \neq W_e]$.
Show that every DNC function computes an FPF function, and conversely.
(Hint: For the converse, apply the $S^m_n$ theorem to get a computable function $h$ such
that $W_{h(e)} = W_{\Phi_e(e)}$ if $\Phi_e(e)\downarrow$, and $W_{h(e)} = \emptyset$ if $\Phi_e(e)\uparrow$. Let $g$ be FPF and
consider $g \circ h$.)


Chapter 3


Instance-solution problems


As mentioned in the introduction, there is a natural way to translate mathematical 
theorems into problems and vice versa. In this chapter, we investigate this relationship 
and begin to collect some of its implications for the program of reverse mathematics. 
As we will see via numerous examples, the translation is not always straightforward, 
and not always unique. We also study coding, or representations, of problems in 
terms of numbers and sets of numbers, which in turn enables us to obtain our 
first assessments of a problem’s computability theoretic strength. The measures of 
complexity introduced in this connection will be important throughout the sequel. 


3.1 Problems 


We start our discussion with the central definition. 


Definition 3.1.1 (Instance-solution problem). An instance-solution problem, or
just problem, is a partial function $P : A \to \mathcal{P}(B)$ for some sets $A$ and $B$. The
elements of $\operatorname{dom}(P)$ are called the instances of $P$, or $P$-instances and, for each $x \in \operatorname{dom}(P)$,
the elements of $P(x)$ are called the solutions to $x$ in $P$, or $P$-solutions to $x$.


Following Blass [16, Section 4], it is instructive to think of a problem as a two player 
game between a “challenger” and a “responder”, the former being tasked with playing 
instances of the problem (the “challenges”) and the latter with playing solutions to 
these in turn (the “responses”). Heuristically, the “hardest” problems are then those 
that are the most difficult for the “responder” (e.g., a hypothetical problem having 
an instance with no solutions), and the “easiest” are those that are the most difficult 
for the “challenger” (e.g., a hypothetical problem having no instances). 

Definition 3.1.1 is quite general, allowing for both of these extreme possibilities. 
That is, it could be that $\operatorname{dom}(P) = \emptyset$, or that there is an $x \in \operatorname{dom}(P)$ with $P(x) = \emptyset$.
Clearly, these are somewhat unusual cases and, indeed, they only arise in specialized
situations. For our purposes, all problems will be assumed to have at least one
instance, and for each instance, at least one solution. We will nonetheless develop a
variety of measures to gauge the “difficulty” of various problems.

For completeness, we make some remarks about notation in the literature. The 
possible partiality of $P$ is sometimes expressed by writing $P : \subseteq A \to \mathcal{P}(B)$.
When, as in our case, it is assumed that $P(x) \neq \emptyset$ for every $x \in \operatorname{dom}(P)$ then $P$ is
regarded by some authors instead as a multivalued function (multifunction), written
$P : \subseteq A \rightrightarrows B$, with the values of $P(x)$ being the $P$-solutions to $x$. (This is the
practice in computable analysis in the style of Weihrauch [323, 324].) Other authors
prefer to regard $P$ as a binary relation, consisting of pairs $(x, y)$ with $x \in \operatorname{dom}(P)$ and
$y \in P(x)$. (In a different context, this conception of an instance-solution problem
was independently proposed by Vojtáš [318]; see Blass [16, Section 4].)

In our case, we will largely suppress all the formalism. We will refer simply 
to problems, their instances and solutions, without naming the sets $A$ and $B$ or
committing to how exactly P is built out of them. The only exception will be when 
we wish to underscore that the P-instances are a subset of some larger set, so that 
thinking of P explicitly as a partial function becomes useful. 


Example 3.1.2. Let $X$ be any nonempty set. Then $\mathsf{Id}_X$ is the problem whose instances
are all $x \in X$, with each such $x$ having itself as a unique solution. As a function, we
have $\mathsf{Id}_X : x \mapsto \{x\}$.


Example 3.1.3. The GCD problem is the problem whose instances are all pairs (x, y) 
of nonzero integers, with each such pair having $\gcd(x, y)$, the greatest common
divisor of x and y, as its unique solution. 


The preceding example illustrates an important aspect for our discussion, which is
that problems can be obtained from theorems. (Here, GCD is a problem form of the 
result that every pair of nonzero integers has a greatest common divisor.) We discuss 
this further in Section 3.2. The following is a more interesting example which also 
underscores an important caveat: the way a problem is obtained from a theorem need 
not be unique. We will explore this issue in detail in Section 3.3. 
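As an informal illustration of Definition 3.1.1 and Examples 3.1.2 and 3.1.3, a problem whose instances and solutions can be checked mechanically may be modeled by a pair of predicates. The class name `Problem` and its two fields are illustrative choices made here, not notation from the text, and the model only covers problems whose instance and solution relations are decidable.

```python
from dataclasses import dataclass
from math import gcd
from typing import Any, Callable

@dataclass(frozen=True)
class Problem:
    is_instance: Callable[[Any], bool]        # x is in dom(P)
    is_solution: Callable[[Any, Any], bool]   # y is in P(x)

# Id_X for X = the integers: each x is its own unique solution.
identity = Problem(
    is_instance=lambda x: isinstance(x, int),
    is_solution=lambda x, y: x == y,
)

# GCD: instances are pairs of nonzero integers, with gcd(x, y) the
# unique solution.
gcd_problem = Problem(
    is_instance=lambda p: len(p) == 2
    and all(isinstance(n, int) and n != 0 for n in p),
    is_solution=lambda p, y: y == gcd(*p),
)
```

For instance, `gcd_problem.is_solution((12, 18), 6)` holds, while `(0, 5)` is rejected as an instance because one entry is zero.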


Example 3.1.4. Consider a partial order $\leq_P$ on a set $X$. An infinite descending
sequence in $\leq_P$ is a sequence $\langle x_n : n \in \omega \rangle$ of elements of $X$ such that $x_{n+1} <_P x_n$
for all $n$. The partial order is well founded if it has no infinite descending sequence.
A well ordering is thus a well founded linear order. A linear extension of $\leq_P$ is a
linear order $\leq_L$ on $X$ such that $x \leq_L y$ for all $x, y \in X$ with $x \leq_P y$. It is not difficult
to see that every partial order admits a linear extension. But Bonnet [18] showed that
every well founded partial order on $\omega$ has a well founded linear extension.

There are several ways to think of Bonnet’s theorem as a problem. The most
obvious is to take the instances to be all well founded partial orders on $\omega$, and the
solutions to each such instance to be its well founded linear extensions. Alternatively,
we may feel it more fitting to regard this as a partial problem on the set of all
partial orders on $\omega$, with domain the subset of partial orders that happen to be well
founded. Of course, as noted above, this is really just a notational distinction.

Yet a third possible way to think of this as a problem, that differs from the previous 
two more significantly, is the following: the instances are all partial orders on $\omega$, and
the solutions to a given instance are its infinite descending sequences (in the case
that the instance is not well founded), or its well founded linear extensions (in the
case that it is).


Problems can be combined with themselves or other problems to produce new 
problems. We list a few examples. 


Definition 3.1.5 (Parallelization). Let P and Q be problems. 


1. The parallel product of $P$ with $Q$, denoted $P \times Q$, is the problem whose instances
are all pairs $(X_0, X_1)$ where $X_0$ is a $P$-instance and $X_1$ is a $Q$-instance, with the
solutions to any such $(X_0, X_1)$ being all pairs $(Y_0, Y_1)$ such that $Y_0$ is a $P$-solution
to $X_0$ and $Y_1$ is a $Q$-solution to $X_1$.

2. The parallelization of $P$, denoted $\widehat{P}$, is the problem whose instances are
sequences $\langle X_i : i \in \omega \rangle$ where each $X_i$ is a $P$-instance, with the solutions to
any such $\langle X_i : i \in \omega \rangle$ consisting of all sequences $\langle Y_i : i \in \omega \rangle$ such that $Y_i$ is a
$P$-solution to $X_i$ for each $i$.


The name “parallelization” comes from the interpretation of $\widehat{P}$ in which we are trying
to solve an infinite collection of instances of P simultaneously, “in parallel”. We will 
see that this can be much more difficult than solving each instance on its own. In a 
sense, to solve the parallelization of a problem we must be able to solve individual 
instances “uniformly”. 
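A rough computational reading of Definition 3.1.5 is sketched below. It assumes that each problem comes with a solver, that is, a function returning one solution for each instance; this is a simplification, since the definition itself does not single out solvers, and all names are illustrative.

```python
from itertools import count, islice
from math import gcd

def parallel_product(solve_p, solve_q):
    """Solver for P x Q built from solvers for P and Q."""
    return lambda pair: (solve_p(pair[0]), solve_q(pair[1]))

def parallelization(solve_p):
    """Solver for the parallelization of P: lazily maps a (possibly
    infinite) stream of P-instances to a stream of P-solutions."""
    return lambda instances: (solve_p(x) for x in instances)

def solve_gcd(pair):
    return gcd(*pair)

def solve_id(x):
    return x

both = parallel_product(solve_gcd, solve_id)
stream = parallelization(solve_gcd)((n, n + 2) for n in count(1))
first_four = list(islice(stream, 4))
```

Here a single procedure handles every coordinate of the sequence, which is the kind of uniformity that solving the parallelization calls for. When no such uniform procedure exists for a problem, the parallelization can be much harder than its individual instances.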

For this chapter and the next, we will use the notions of parallel product and 
parallelization largely in a technical way, to illustrate various concepts. More natural 
examples of these operations will appear later on. 

We will typically only be interested in problems all of whose instances and
solutions are subsets of $\omega$. As we will see, many mathematical objects can be
represented by subsets of $\omega$, so this still leaves a very broad range of problems. For
example, this is the case for all three of the problems mentioned in the preceding
example. We will discuss various such representations in Section 3.4 below.


3.2 $\forall\exists$ theorems


As hinted in Chapter 1, the vast majority of problems we will consider come from
$\forall\exists$ theorems. Recall that these are theorems having the form

$$(\forall x)[\varphi(x) \to (\exists y)\psi(x, y)]. \tag{3.1}$$

There is a canonical way to translate (3.1) into a problem: we take as its instances
all $x$ such that $\varphi(x)$ holds, and as the solutions to an instance $x$ all $y$ such that
$\psi(x, y)$ holds.

We can illustrate this by looking first at König’s lemma (Theorem 2.8.3), discussed
in Section 2.8. Recall that this asserts that every infinite, finitely branching tree has
an infinite path. We restate this below to make it more clear that this has the same
form as (3.1). We also give an important variant, called weak König’s lemma, which
we will encounter frequently in the sequel.


Definition 3.2.1 (König’s lemma and weak König’s lemma).


1. KL is the following statement: for every infinite, finitely branching tree $T \subseteq \omega^{<\omega}$,
there exists an $f \in \omega^\omega$ which is a path through $T$.

2. WKL is the following statement: for every infinite tree $T \subseteq 2^{<\omega}$, there exists an
$f \in 2^\omega$ which is a path through $T$.


We can then define the associated problem forms. 


Definition 3.2.2 (König’s lemma and weak König’s lemma, problem forms).


1. KL is the problem whose instances are all infinite, finitely branching trees
$T \subseteq \omega^{<\omega}$, with the solutions to any such $T$ being all its paths.

2. WKL is the problem whose instances are all infinite trees $T \subseteq 2^{<\omega}$, with the
solutions to any such $T$ being all its paths.
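Finite approximations to an instance of WKL and to its solutions can be explored concretely. In the sketch below a tree $T \subseteq 2^{<\omega}$ is given by a membership predicate on binary strings; the function names are illustrative, and the code only inspects finitely many levels, so it says nothing by itself about infinite paths.

```python
from itertools import product

def level(in_tree, n):
    """All strings of length n in the tree given by the predicate in_tree."""
    strings = ("".join(bits) for bits in product("01", repeat=n))
    return [s for s in strings if in_tree(s)]

def is_path_prefix(in_tree, sigma):
    """Check that every initial segment of sigma lies in the tree, as must
    hold for each finite initial segment of a path."""
    return all(in_tree(sigma[:k]) for k in range(len(sigma) + 1))

# Toy instance: binary strings with no two consecutive 0s. This set is
# closed under initial segments and has strings of every length.
def in_tree(sigma):
    return "00" not in sigma
```

For this toy instance, `level(in_tree, 3)` lists the five strings of length 3 in the tree, and `is_path_prefix` accepts `"0110"` but rejects `"0100"`.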


We deliberately use the same abbreviation for the $\forall\exists$ theorem and its associated
problem form. This cuts down on notation, but also reflects the practice of shifting
freely between the two perspectives. As we will see, for the vast majority of examples
we encounter it will be convenient to use the theorem and problem forms
interchangeably.

For another example, consider the well-known infinitary pigeonhole principle 
asserting that a finite partition of the natural numbers must have an infinite part. 
Here, we think of a partition of $\omega$ into $k$ parts as a function $f : \omega \to k$, so that two
numbers belong to the same part provided they are assigned the same value $i < k$ by
$f$. A finite partition is thus more generally a function $f : \omega \to \omega$ with bounded range.
So formally, we take the infinitary pigeonhole principle (IPHP) to be the statement:
for every $f : \omega \to \omega$ with bounded range, there exists an $i \in \omega$ so that $f^{-1}\{i\} \subseteq \omega$
is infinite. The problem version associated to this principle is then the following.


Definition 3.2.3 (Infinitary pigeonhole principle). IPHP is the problem whose
instances are functions $f : \omega \to \omega$ with bounded range, with the solutions to any
such $f$ being all $i \in \omega$ such that $f^{-1}\{i\} \subseteq \omega$ is infinite.


One point of consideration here is whether an instance of IPHP should instead be
a pair, $(k, f)$, where $k \in \omega$ and $f : \omega \to k$, i.e., whether the number of parts of the
partition should be given explicitly as part of the instance. At first blush, we may not
consider this to make much of a difference: a function $\omega \to \omega$ with bounded range
is a function $\omega \to k$ for some $k$, after all. But the issue is that there is no clear way
to determine such a $k$ from $f$ alone (see Exercise 3.9.2). Indeed, as we will see in
Chapter 4, there is a sense in which the version of IPHP with the number of parts
specified is strictly weaker than the one in the definition above. We will explore other
ways in which a principle can potentially have multiple interpretations as a problem
in the next section.

We conclude this section with one final problem, which will be a prominent
example throughout this chapter and the next, as well as the central focus of all of
Chapters 8 and 9. This is the infinitary Ramsey’s theorem (RT), which states, for each
fixed $n \geq 1$, that if the unordered $n$-tuples of natural numbers are each assigned one
of finitely many colors, then there exists an infinite subset of the natural numbers all
$n$-tuples of which are assigned the same color. To make this more precise, we make
the following definitions.


Definition 3.2.4. Fix numbers $n, k \geq 1$ and a set $X \subseteq \omega$.


1. $[X]^n = \{F \subseteq X : |F| = n\}$.
2. A $k$-coloring of $[X]^n$ is a map $c : [X]^n \to k$.
3. A set $Y \subseteq X$ is homogeneous for $c : [X]^n \to k$ if $c$ is constant on $[Y]^n$.


When $k$ is fixed or emphasis on it is unnecessary, we can speak simply of finite
colorings or colorings, instead of $k$-colorings. If $Y \subseteq X$ is homogeneous for $c$, and
$c$ takes the value $i < k$ on all elements of $[Y]^n$, then we also say $Y$ is homogeneous
for $c$ with color $i$.

Given $\{x_0, \ldots, x_{n-1}\} \in [\omega]^n$, we usually write $c(x_0, \ldots, x_{n-1})$ in place of
$c(\{x_0, \ldots, x_{n-1}\})$ for brevity. We write $\langle x_0, \ldots, x_{n-1} \rangle \in [X]^n$ as shorthand for
$\{x_0, \ldots, x_{n-1}\} \in [X]^n$ and $x_0 < \cdots < x_{n-1}$. In this way, we tacitly identify $[X]^n$ with
the set of increasing $n$-tuples of elements of $X$, and so sometimes also use $\vec{x}, \vec{y}, \ldots$
to denote elements of $[X]^n$. If $n > 1$ and we are given $\vec{x} = \langle x_0, \ldots, x_{n-2} \rangle \in [X]^{n-1}$
and $y > x_{n-2}$, we may also write $c(\vec{x}, y)$ as shorthand for $c(x_0, \ldots, x_{n-2}, y)$.
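For finite sets, the notions in Definition 3.2.4 can be checked directly. The following sketch represents $[X]^n$ by increasing $n$-tuples, as in the identification just described; the function names and the sample coloring are illustrative and not taken from the text.

```python
from itertools import combinations

def n_subsets(xs, n):
    """[X]^n for a finite set X, as increasing n-tuples of its elements."""
    return list(combinations(sorted(xs), n))

def is_homogeneous(ys, coloring, n):
    """True if the coloring is constant on [Y]^n (finite Y only)."""
    return len({coloring(*t) for t in n_subsets(ys, n)}) <= 1

# A 2-coloring of pairs: color a pair by the parity of x + y.
def parity(x, y):
    return (x + y) % 2
```

Under `parity`, any set of even numbers is homogeneous with color 0, whereas `{0, 1, 2}` is not homogeneous, since the pairs `(0, 1)` and `(0, 2)` receive different colors.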


Definition 3.2.5 (Infinitary Ramsey’s theorem). 


1. For $n, k \geq 1$, $\mathsf{RT}^n_k$ is the problem whose instances are all colorings $c : [\omega]^n \to k$,
with the solutions to any such $c$ being all its infinite homogeneous sets.

2. For $n \geq 1$, $\mathsf{RT}^n$ is the problem whose instances are all colorings $c : [\omega]^n \to k$ for
some $k \geq 1$, with the solutions to any such $c$ being all its infinite homogeneous
sets.

3. RT is the problem whose instances are all colorings $c : [\omega]^n \to k$ for some
$n, k \geq 1$, with the solutions to any such $c$ being all its infinite homogeneous sets.


Thus, if we go back to thinking of problems as functions, then RT is just the union
of $\mathsf{RT}^n$ for all $n \geq 1$, and $\mathsf{RT}^n$ for each fixed $n$ is the union of $\mathsf{RT}^n_k$ for all $k \geq 1$. Each
instance of $\mathsf{RT}^n_k$ is an instance of $\mathsf{RT}^n$, which is in turn an instance of RT.

As implied by the “infinitary” adjective, there is also a finitary analogue of RT, 
which we discuss in Definition 3.3.6 below. Moving forward, however, we will simply 
say “Ramsey’s theorem” in place of “infinitary Ramsey’s theorem”, and use this to
always refer to the problem RT defined above. 

As a theorem, Ramsey’s theorem is a powerful generalization of the infinitary
pigeonhole principle, which is essentially Ramsey’s theorem for singletons. However,
as problems, $\mathsf{RT}^1$ and IPHP are quite different. For starters, there is the technical
distinction between $[\omega]^1$ and $\omega$ itself, but we can ignore this and thereby regard the
two problems as having exactly the same set of instances. The main difference is that
a solution to an instance of $\mathsf{RT}^1$ is an infinite set of numbers, whereas a solution to an
instance of IPHP is a single number. In particular, there are instances of IPHP having
unique solutions, whereas there are no such instances of $\mathsf{RT}^1$. We will soon develop
means to formally show that IPHP is a strictly “harder” problem than $\mathsf{RT}^1$, precisely
because of this distinction. Yet another version of IPHP appears in Definition 3.3.2
below.
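The difference in the shape of solutions can be made concrete. In this sketch (the function names are illustrative), a solution $i$ to an instance $f$ of IPHP is turned into an enumeration of a homogeneous set for $f$, and a homogeneous set is turned back into a color. Note that both translations consult the instance $f$ itself, not just the given solution.

```python
from itertools import count, islice

def homogeneous_from_color(f, i):
    """Enumerate f^{-1}{i} in increasing order; this enumeration is
    infinite exactly when i is an IPHP-solution to f."""
    return (x for x in count() if f(x) == i)

def color_from_homogeneous(f, elements):
    """Given an iterable over a nonempty set homogeneous for f, return
    the color of that set by evaluating f at its first element."""
    return f(next(iter(elements)))

def f(x):
    # A partition of the natural numbers into three parts.
    return x % 3
```

For example, the first four elements enumerated from color 1 are 1, 4, 7, 10, and the set containing 5, 8, 11 has color 2.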

We add that the translation process, from $\forall\exists$ theorem to problem, also works in
reverse. To each problem $P$ we can associate the $\forall\exists$ theorem

$$(\forall x)[x \text{ is a } P\text{-instance} \to (\exists y)[y \text{ is a } P\text{-solution to } x]].$$

This motivates the following convention.


Convention 3.2.6 (Defining theorems and problems). In the sequel, when the translations
are clear, we will usually define either a problem or the statement of a $\forall\exists$
theorem. Except in cases where more than one translation is possible, we will then
use the same initialisms and abbreviations for both.


As in the “forward” translation, we will typically only look at problems whose 
instances and solutions are “represented” by subsets of $\omega$.


3.3 Multiple problem forms 


The method of deriving a problem from a $\forall\exists$ theorem described above is very
specific to the syntactic form of (3.1). Different but logically equivalent syntactic
forms can thereby give rise to different problems. For example, if we rewrite (3.1) as

$$(\forall x)(\exists y)[\neg\varphi(x) \vee \psi(x, y)], \tag{3.2}$$

then the above method results in the problem having as instances all $x$ (in the ambient
set), with an instance $x$ having as solutions either all $y$ if $\varphi(x)$ does not hold, or all $y$
such that $\psi(x, y)$ holds if $\varphi(x)$ does hold. We saw this concretely in Example 3.1.4.
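The two readings can be compared side by side in a small sketch. It assumes that $\varphi$ and $\psi$ are given as decidable predicates `phi` and `psi`, which is a strong simplification made only for illustration, and it represents a problem by a pair of predicates (instance test, solution test); these names are not from the text.

```python
def canonical_problem(phi, psi):
    """Problem read off from (forall x)[phi(x) -> (exists y) psi(x, y)]:
    instances are the x with phi(x); solutions are the y with psi(x, y)."""
    return (phi, psi)

def disjunctive_problem(phi, psi):
    """Problem read off from (forall x)(exists y)[not phi(x) or psi(x, y)]:
    every x is an instance; any y solves x when phi(x) fails."""
    return (lambda x: True, lambda x, y: (not phi(x)) or psi(x, y))

# Toy theorem: every even number x has a y with 2 * y == x.
def phi(x):
    return x % 2 == 0

def psi(x, y):
    return 2 * y == x

is_inst_c, is_sol_c = canonical_problem(phi, psi)
is_inst_d, is_sol_d = disjunctive_problem(phi, psi)
```

In the toy example, 3 is not an instance of the canonical problem but is an instance of the disjunctive one, where every number counts as a solution to it.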

Such distinctions are not always purely formal, however. While many theorems 
we encounter have a canonical presentation in the form of (3.1), for others, multiple 
forms will be equally natural, and these theorems will thus not admit any single 
“right” problem version. An interesting example comes from looking at contrapositives.
For instance, consider the simple principle that every finite union of finite
subsets of $\omega$ is finite. Its problem form is the following.


Definition 3.3.1 (Finite unions principle). FUF is the problem whose instances are
families $\{F_i : i \in \omega\}$ of finite sets with $F_i = \emptyset$ for almost all $i$, with the solutions to
any such collection being all numbers $b > \bigcup_{i \in \omega} F_i$ (that is, all $b$ that bound every element of the union).


Note that the instances above are thus all possible finite collections of finite subsets
of $\omega$, again without having to explicitly specify how many sets we are dealing with.
On the other hand, we could easily restate the principle by saying that in any finite
collection of subsets of $\omega$ whose union is infinite, at least one of the subsets must be
infinite. The problem form is then the following generalized version of IPHP.


Definition 3.3.2 (Infinitary pigeonhole principle over general sets). General-IPHP
is the problem whose instances are families $\{X_i : i \in \omega\}$ of subsets of $\omega$ whose
union is infinite and $X_i = \emptyset$ for almost all $i$, with the solutions to any such collection
being all $i$ such that $X_i$ is infinite.


Thus IPHP is the restriction of General-IPHP to the case where all the $X_i$ in the
instance are disjoint and $\bigcup_{i \in \omega} X_i = \omega$.

The two principles FUF and General-IPHP are equivalent, of course, as mathematical
statements. Not so for the associated problems, which can be discerned not
just by their different sets of instances, but also, more tellingly, by the complexity
of their solutions. Given a finite collection of finite subsets of $\omega$, determining that a
number is an upper bound on the union is $\Pi^0_1$ relative to the collection. By contrast,
given a finite collection of subsets of $\omega$ with infinite union, determining one of the
sets that is infinite is in general $\Pi^0_2$ relative to the collection. Based on this, we should
expect General-IPHP to be “harder” than FUF in a precise sense, and this is indeed 
the case (Proposition 4.3.5). 

By varying a problem, we can also similarly glean something about the
plexity of a problem’s instances (as opposed to just its solutions, or the relationship 
between the two). In Example 3.1.4, we saw two problem versions of Bonnet’s theo- 
rem, one whose instances were well founded partial orderings and solutions were well 
founded linear extensions, and one whose instances additionally included ill founded 
partial orderings, with solutions to these being infinite descending sequences. We 
can consider a similar “disjunctive version” of WKL. 


Definition 3.3.3 (Weak König’s lemma, disjunctive form). Disj-WKL is the problem
whose instances are all trees $T \subseteq 2^{<\omega}$, with the solutions to any such $T$ being
either all its infinite paths (if $T$ is infinite) or all $\ell \in \omega$ such that $T$ contains no string
of length $\ell$ (if $T$ is finite).
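The second kind of solution to Disj-WKL, a level at which the tree has died out, is something a bounded search can find. The sketch below assumes the tree is given by a membership predicate on binary strings; the function name and the search bound are illustrative.

```python
from itertools import product

def empty_level(in_tree, bound):
    """Least l <= bound such that the tree contains no string of length l,
    or None if every level up to bound is nonempty."""
    for l in range(bound + 1):
        strings = ("".join(bits) for bits in product("01", repeat=l))
        if not any(in_tree(s) for s in strings):
            return l
    return None

def finite_tree(s):
    # The tree consisting of the empty string, "0", and "00".
    return len(s) <= 2 and "1" not in s

def infinite_tree(s):
    # The tree of all finite strings of 0s.
    return "1" not in s
```

A numerical answer certifies that the tree is finite. An answer of `None` is inconclusive, since no bounded search can certify that a tree is infinite.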


For another example, we can turn to the well-known Heine-Borel theorem.


Definition 3.3.4 (Heine-Borel theorem for [0, 1]).


1. $\mathsf{HBT}_{[0,1]}$ is the problem whose instances are sequences $\langle (x_k, y_k) : k \in \omega \rangle$
of real open intervals covering $[0, 1]$, with solutions being all $\ell \in \omega$ such
that each $x \in [0, 1]$ belongs to $(x_k, y_k)$ for some $k < \ell$.

2. $\mathsf{Disj\text{-}HBT}_{[0,1]}$ is the problem whose instances are sequences $\langle (x_k, y_k) : k \in \omega \rangle$
of real open intervals, with solutions being all $\ell \in \omega$ such that each
$x \in [0, 1]$ belongs to $(x_k, y_k)$ for some $k < \ell$, or all $x \in [0, 1]$ that do not belong
to $(x_k, y_k)$ for any $k$.


In all these examples, we thus have a version that places some (or more) of the onus 
of describing a problem’s instances on the “solver”, isolating a particular aspect of
the instances without which the problem is trivial or uninteresting. 

One way to think of the above issues is in terms of constructive mathematics.
Without the law of excluded middle, we cannot simply assert that a partial ordering 
either is or is not well founded, that a collection of intervals does or does not cover 
the real unit interval, or that a tree is or is not infinite. On this view, if we are unable to 
produce a well founded linear extension of a given partial order, or a finite subcover 
of a given set of intervals, or an infinite path through a given tree, then we should 
construct an explicit witness for why this is the case. The question of how difficult 
exhibiting such a witness is says something about how “hard” the problem is, and so 
is of interest in many situations we encounter. 

Likewise, it is well-known that contrapositives are not necessarily constructively 
equivalent. We saw a reflection of this between FUF and General-IPHP, but a more 
famous example is that of WKL and Brouwer’s fan theorem. To state it, we recall that 
a bar is a subset $B$ of $2^{<\omega}$ such that every $X \in 2^{\omega}$ extends some $\sigma \in B$.


Definition 3.3.5 (Brouwer’s fan theorem). FAN is the problem whose instances 
are bars, with a solution to any bar being all its finite subsets that are also bars. 


See, e.g., Berger, Ishihara, and Schuster [15] for a discussion of the relationship 
between FAN and WKL in the constructive setting. We will compare WKL and FAN 
more carefully in Proposition 4.3.8. 

Finally, a theorem may admit multiple problem forms simply because its usual 
statement has multiple interpretations, even from an ordinary mathematical view- 
point. An example of this is the finitary Ramsey’s theorem. (This should be compared 
with RT, the infinitary Ramsey’s theorem, defined in Definition 3.2.5). The usual 
way this is stated is as follows. 


Definition 3.3.6 (Finitary Ramsey’s theorem). FRT is the following statement:
for all $n, k, m \geq 1$ with $m \geq n$, there is an $N \in \omega$ such that for every finite set $X$ with
$|X| \geq N$, every $c : [X]^n \to k$ has a homogeneous set $H \subseteq X$ with $|H| \geq m$.


The least N as above is called the Ramsey number for n, k, and m (or more properly, 
the hypergraph Ramsey number, if $n > 2$). We will use the notation $R^n_k(m)$ to denote
this number, though this is nonstandard. (In the combinatorics literature, it is usually
denoted by $R_n(m, \ldots, m)$, with $k$ many $m$ terms.) What problem should we associate
to FRT? Here it seems we have (at least) two possibilities. 


• The problem of bounding Ramsey numbers is the problem whose instances are
all triples of integers $(n, k, m)$ with $m \geq n$, with the solutions to any such triple
being all $N \geq R^n_k(m)$.

• The problem of finding Ramsey numbers is the problem whose instances are all
triples of integers $(n, k, m)$ with $m \geq n$, with the solution to any such triple
being $R^n_k(m)$.


And indeed, both of these are problems associated to the finitary Ramsey’s theorem 
that mathematicians work on. (As an aside, they are also vastly different in diffi- 
culty: computing Ramsey numbers explicitly is much harder than bounding them. A 
notorious case in point is the problem of computing $R^2_2(6)$. This is still open, and
has been since the 1950s, though it is known that $R^2_2(6) \leq 165$. For more about
Ramsey numbers, see, e.g., the textbook of Graham, Rothschild, and Spencer [128].) 
We could also consider a third problem form, closer to that of RT, involving actually 
finding homogeneous sets for colorings defined on sufficiently large finite sets. 
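
For very small parameters, the definition of $R^n_k(m)$ can be checked by exhaustive search. The following is a minimal Python sketch, assuming exponent $n = 2$ (colorings of pairs) and $m \geq 2$; the function names are illustrative choices of ours and do not come from the text. It recovers the classical value $R^2_2(3) = 6$. The search space grows so quickly that the same approach is hopeless beyond tiny cases, in line with the remark that finding Ramsey numbers is much harder than bounding them.

```python
from itertools import combinations, product


def has_homogeneous_set(N, coloring, m):
    """True if some m-element subset of range(N) has all its pairs the same color."""
    for H in combinations(range(N), m):
        colors = {coloring[pair] for pair in combinations(H, 2)}
        if len(colors) <= 1:
            return True
    return False


def is_ramsey_bound(N, m, k=2):
    """True if every k-coloring of pairs from range(N) has a homogeneous set of size m."""
    pairs = list(combinations(range(N), 2))
    for values in product(range(k), repeat=len(pairs)):
        coloring = dict(zip(pairs, values))
        if not has_homogeneous_set(N, coloring, m):
            return False
    return True


def ramsey_number(m, k=2):
    """Least N that is a bound in the sense above (exponent 2 only, brute force)."""
    N = m
    while not is_ramsey_bound(N, m, k):
        N += 1
    return N
```

Here `is_ramsey_bound` corresponds to checking a candidate solution to the bounding problem, while `ramsey_number` corresponds to the finding problem.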


3.4 Represented spaces 


While the general notion of a problem does not require that the instances and solutions 
belong to any particular set, our interest in analyzing problems from the point of 
view of computability theory requires these to be objects to which computability 
theoretic notions can be applied, which is to say, ultimately, subsets of $\omega$. This is
facilitated by the notion of representation. 


Definition 3.4.1. A representation of a set $\mathcal{X}$ is a partial surjection $\delta : 2^{\omega} \to \mathcal{X}$. The
pair $(\mathcal{X}, \delta)$ is called a represented space.


The idea is that the elements of $\mathcal{X}$ are coded by subsets of $\omega$: if $\delta(X) = x$ for some
$X \subseteq \omega$ and $x \in \mathcal{X}$ then $X$ is a code for $x$. Here, we will identify $n \in \omega$ with $\{n\} \subseteq \omega$,
and thereby allow for codes to also be natural numbers (instead of only sets of natural
numbers). Note that $\delta$ above need not be injective, so an element of $\mathcal{X}$ may have
multiple codes under a given representation.

For example, the elements of each of $\mathbb{Z}$ and $\mathbb{Q}$ can be coded by natural numbers
using the pairing function. Similarly for $\omega^{<\omega}$, using codes for sequences as
in Definition 2.2.12. A different but equally straightforward representation is the 
following. 


Example 3.4.2. Consider the set $\omega^{\omega}$ of all functions $f : \omega \to \omega$, which is clearly not
a set of subsets of $\omega$. Let $\delta$ be the following map. The domain of $\delta$ consists of all
$X \subseteq \omega$ such that for each $n \in \omega$ there is exactly one $m \in \omega$ with $\langle n, m \rangle \in X$. Given
such an $X$, let $\delta(X)$ be the set of all ordered pairs of natural numbers $(n, m)$ such
that $\langle n, m \rangle \in X$, noting that $\delta(X)$ is a function $\omega \to \omega$. Then $\delta$ is a representation
of $\omega^{\omega}$.
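
To make Example 3.4.2 concrete, here is a small Python sketch. It assumes the Cantor pairing function as the pairing $\langle n, m \rangle$ (the text does not fix a particular one at this point), and it models a subset of $\omega$ by its membership test, a Python function from numbers to booleans. The names `pair`, `unpair`, `code_of`, and `delta` are hypothetical and chosen only for illustration.

```python
from math import isqrt


def pair(n, m):
    """Cantor pairing: a bijection from pairs of naturals onto the naturals."""
    return (n + m) * (n + m + 1) // 2 + m


def unpair(z):
    """Inverse of pair."""
    w = (isqrt(8 * z + 1) - 1) // 2
    m = z - w * (w + 1) // 2
    return w - m, m


def code_of(f):
    """Membership test of the set X = {<n, f(n)> : n a natural number}."""
    def X(z):
        n, m = unpair(z)
        return f(n) == m
    return X


def delta(X):
    """Decode a code X: return the function f with <n, f(n)> in X.

    The search below terminates only because X is assumed to lie in the
    domain of the representation (exactly one m for each n).
    """
    def f(n):
        m = 0
        while not X(pair(n, m)):
            m += 1
        return m
    return f
```

For instance, `delta(code_of(f))` agrees with `f` on every input, reflecting that each function has a code and that decoding is computable relative to the code.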


Representations can be easily combined to produce new represented spaces from
old ones. Recall that if we have a finite collection $X_0, \ldots, X_{n-1}$ of subsets of $\omega$
then $\langle X_0, \ldots, X_{n-1} \rangle$ denotes the join of these sets, i.e., $\{\langle x, i \rangle : x \in X_i \wedge i < n\}$, which is
again a subset of $\omega$. And if instead we have an infinite collection $X_0, X_1, \ldots$, then
$\langle X_i : i \in \omega \rangle$ denotes $\{\langle x, i \rangle : x \in X_i \wedge i \in \omega\}$.


Definition 3.4.3. Let $(\mathcal{X}_0, \delta_0)$ and $(\mathcal{X}_1, \delta_1)$ be represented spaces, and let
$\delta : 2^{\omega} \to \mathcal{X}_0 \times \mathcal{X}_1$ be the partial map whose domain consists of all $\langle X_0, X_1 \rangle$
with $X_0 \in \operatorname{dom}(\delta_0)$ and $X_1 \in \operatorname{dom}(\delta_1)$, and such that


$$\delta(\langle X_0, X_1 \rangle) = (\delta_0(X_0), \delta_1(X_1)).$$


It is evident that $\delta$ above is a representation of the set $\mathcal{X}_0 \times \mathcal{X}_1$. In particular, if $(\mathcal{X}, \delta)$
is a represented space then this yields a natural representation of $\mathcal{X}^n$ for each fixed
$n \in \omega$. We can also obtain representations of other kinds of self-products, as follows.


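Definition 3.4.4. Let $(\mathcal{X}, \delta)$ be a represented space.


1. Let $\delta_{<\omega} : 2^{\omega} \to \mathcal{X}^{<\omega}$ be the partial map whose domain consists of all
$\langle X_0, \ldots, X_{n-1} \rangle$, where $n \in \omega$, $X_i \in \operatorname{dom}(\delta)$ for all $i < n$, and


$$\delta_{<\omega}(\langle X_0, \ldots, X_{n-1} \rangle) = (\delta(X_0), \ldots, \delta(X_{n-1})).$$


2. Let $\delta_{\omega} : 2^{\omega} \to \mathcal{X}^{\omega}$ be the partial map whose domain consists of all $\langle X_i : i \in \omega \rangle$,
where $X_i \in \operatorname{dom}(\delta)$ for all $i \in \omega$ and


$$\delta_{\omega}(\langle X_i : i \in \omega \rangle) = (\delta(X_i) : i \in \omega).$$


Here, we clearly have that $\delta_{<\omega}$ and $\delta_{\omega}$ are representations of $\mathcal{X}^{<\omega}$ and $\mathcal{X}^{\omega}$,
respectively.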

One takeaway of the above definitions for us is that some basic sets that ought
to have representations indeed do. Thus we can, for example, speak of pairs of
functions $\omega \to \omega$, or tuples or finite or infinite sequences of the same, and formally
this will refer to an element of one of the represented spaces per the preceding 
definition. 

Representations are not unique, and a space may even admit multiple equally natural
representations. For example, while each rational number $q$ can be represented
by a number $\langle n, m \rangle \in \omega$ with $m \neq 0$ and $n/m = q$ (which defines a noninjective
representation of $\mathbb{Q}$), the same rational could also be coded by the set of all such
numbers $\langle n, m \rangle$ (and this would define an injective representation instead). In this
specific instance, the choice is inconsequential for our purposes: from a code of the 
first kind we can uniformly compute the code of the second, and given a code of the 
latter kind, any element of this code (which is a set) serves as a code of the first kind. 

In other situations, we cannot so easily move between different representations, 
so an explicit choice is required. A case in point is representing the real numbers 
and functions defined on the real numbers, which we discuss in the next section and 
then in far more depth in Chapter 10. 

Still, once a representation has been established, we usually largely suppress it 
for ease of terminology. More precisely, we abide by the following. 


Convention 3.4.5. When no confusion can arise, we identify elements of a repre- 
sented space with their codes and forego mentioning the representation explicitly. 
We thus formally work with the domain of a representation rather than the repre- 
sented set itself, and pretend that the two are one and the same for the purposes of 
stating definitions, theorems, and problems. 


So, moving forward, we will use familiar notations like $\mathbb{Q}^{<\omega}$ or $\mathbb{Z} \times \omega^{\omega}$ and move
freely between thinking of these as the original sets and as the represented spaces.
Similarly, when talking about tuples of sequences of elements (from some represented
space), we may use $(\cdots)$ and $\langle \cdots \rangle$ interchangeably, always formally referring
to the latter. And when we say, e.g., that a subset of $\mathbb{Q}^{<\omega}$ is computable, or that an
element of $\mathbb{Z} \times \omega^{\omega}$ computes $\emptyset'$, it should always be understood that we are actually
referring to these objects’ codes.


3.5 Representing $\mathbb{R}$


There are many possible representations of $\mathbb{R}$: any surjective map from $2^{\omega}$ will do.
However, an arbitrary such map will not necessarily preserve any of the analytic 
properties of the reals, which will make working with this representation difficult. 
We consider this issue in greater detail in Section 10.1. For now, we fix the following. 


Definition 3.5.1 (Representation of R). 


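1. A code for a real number is a sequence $\langle q_n : n \in \omega \rangle$ of rational numbers such
that $|q_n - q_{n+1}| < 2^{-n}$ for all $n$.

2. The map $\delta_{\mathbb{R}}$ has domain the set of all codes of real numbers, and if $\langle q_n : n \in \omega \rangle$
is any such code then $\delta_{\mathbb{R}}(\langle q_n : n \in \omega \rangle) = \lim_{n \to \infty} q_n \in \mathbb{R}$, in which case we
also say that $\langle q_n : n \in \omega \rangle$ is a code for the real number $\lim_{n \to \infty} q_n$.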


Using Cauchy sequences here is perhaps expected, but the bound of $2^{-n}$ may be less
so. We could demand instead that $|q_n - q_m| < f(n)$ for all $n$ and $m > n$, where
$f : \omega \to \mathbb{Q}$ is any nonincreasing computable function with $\lim_{n \to \infty} f(n) = 0$, and
get basically the same representation (see Exercise 10.9.1). The real advantage here 
is that this definition is both structurally and computationally well-behaved, as we 
can see from the following definitions. 


Definition 3.5.2. Let $x = \langle q_n : n \in \omega \rangle$ and $y = \langle r_n : n \in \omega \rangle$ be codes for real
numbers.


1. $-x$ is the code $\langle -q_n : n \in \omega \rangle$.

2. $x + y$ is the code $\langle q_{n+1} + r_{n+1} : n \in \omega \rangle$.

3. $x < y$ if there exist $m, n_0 \in \omega$ such that $q_n + 2^{-m} < r_n$ for all $n > n_0$.

4. $x = y$ if for every $m \in \omega$ there exists $n_0 \in \omega$ such that $|q_n - r_n| < 2^{-m}$ for all
$n > n_0$, and $x \neq y$ otherwise.


Now, from codes $x$ and $y$ (as oracles), we can uniformly compute the codes $-x$ and
$x + y$. If $x \neq y$, we can also uniformly computably determine whether $x < y$ or $y < x$
(see Exercise 3.9.6). However, as is well-known from constructive mathematics, 
there is no computable procedure to tell whether or not x = y (Exercise 3.9.7). 
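
The operations of Definition 3.5.2 can be sketched in Python as follows. This is only an illustration under stated assumptions: a code is modeled as a Python function sending $n$ to the rational $q_n$ (as a `Fraction`), and the names `const`, `neg`, `add`, and `less_than` are ours. The comparison uses the fact that $|q_n - q_{n+1}| < 2^{-n}$ for all $n$ implies $|q_n - \lim q_n| \leq 2^{1-n}$, so a gap larger than $2^{2-n}$ between $q_n$ and $r_n$ settles the order. When $x = y$ the search never halts, matching the fact that equality is not computably decidable.

```python
from fractions import Fraction


def const(q):
    """The constant code of a rational number q."""
    q = Fraction(q)
    return lambda n: q


def neg(x):
    """Code of -x: negate every term."""
    return lambda n: -x(n)


def add(x, y):
    """Code of x + y: shift indices by one so the 2^-n bound is preserved."""
    return lambda n: x(n + 1) + y(n + 1)


def less_than(x, y):
    """Assuming x != y, decide whether x < y. Diverges if x = y."""
    n = 0
    while True:
        margin = Fraction(4, 2 ** n)  # twice the error bound 2^(1-n)
        if y(n) - x(n) > margin:
            return True
        if x(n) - y(n) > margin:
            return False
        n += 1


# A non-constant code of the real number 1/3 (an illustrative choice).
third = lambda n: Fraction(1, 3) + Fraction(1, 2 ** (n + 2))
```

The index shift in `add` is what keeps the sum a code: two errors of size below $2^{-(n+1)}$ add up to less than $2^{-n}$.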

Since Definition 3.4.4 permits us to speak of sequences of codes of real numbers,
we can also define the following. 


Definition 3.5.3. Let $\langle x_k : k \in \omega \rangle$ be a sequence of codes of real numbers, with
$x_k = \langle q_{k,n} : n \in \omega \rangle$ for each $k$.


1. The sequence $\langle x_k : k \in \omega \rangle$ is bounded if there exist codes for real numbers $y$
and $z$ such that $y < x_k < z$ for all $k$.

2. The sequence $\langle x_k : k \in \omega \rangle$ converges to a code for a real number $x = \langle q_n : n \in \omega \rangle$
if for each $m \in \omega$ there is a $k_0 \in \omega$ such that for each $k > k_0$ there is an
$n_0 \in \omega$ with $|q_{k,n} - q_n| < 2^{-m}$ for all $n > n_0$.

3. A subsequence of $\langle x_k : k \in \omega \rangle$ is a sequence of codes for real numbers
$\langle y_j : j \in \omega \rangle$ such that there is an infinite set $J = \{k_0 < k_1 < \cdots\} \subseteq \omega$ with
$y_j = x_{k_j}$ for all $j$.


In a similar fashion, we can easily transfer a host of other familiar definitions from the
reals to codes for reals, such as other arithmetical operations, the distance between 
two real numbers, etc. 

Notably, all of these notions behave as the original notions do when passed through 
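$\delta_{\mathbb{R}}$. For example, $x = y$ precisely when $\delta_{\mathbb{R}}(x) = \delta_{\mathbb{R}}(y)$; $\delta_{\mathbb{R}}(x + y) = \delta_{\mathbb{R}}(x) + \delta_{\mathbb{R}}(y)$;
$\langle x_k : k \in \omega \rangle$ converges to $x$ precisely when $(\delta_{\mathbb{R}}(x_k) : k \in \omega)$ converges to $\delta_{\mathbb{R}}(x)$;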
and so on. We therefore need not dwell on the formal distinctions between the actual 
notions for the reals and the corresponding notions for the codes, and following 
Convention 3.4.5, will usually speak simply of “the reals” instead of “codes for 
reals”, when convenient. Consider the following example: 


Definition 3.5.4 (Bolzano-Weierstrass theorem). BW is the problem whose instances
are bounded sequences of real numbers, with the solutions to any such
sequence being all its convergent subsequences. 


As it appears here, this is the direct translation of the Bolzano-Weierstrass theorem
into the parlance of instance-solution problems, with no heed for representations
or similar concerns. But having now fixed a representation for $\mathbb{R}$, we may regard it
instead as a statement about codes for reals, with the various properties and operations
from Definitions 3.5.2 and 3.5.3 plugged into their respective locations.

Notice that unlike with functions on countable sets, cardinality considerations 
mean we cannot hope to represent all functions $\mathbb{R} \to \mathbb{R}$. However, we can hope
to represent continuum-sized subclasses, of which the continuous functions are an 
especially important example. We will develop such a representation in Chapter 10, 
along with a generalization to other separable metric spaces. For now, we define a 
representation of continuous real-valued functions on the real closed unit interval, 
which the more general definition will extend. 


Definition 3.5.5. A code for a continuous real-valued function on $[0, 1]$ is a uniformly
continuous function $f : [0, 1] \cap \mathbb{Q} \to \mathbb{R}$.


Here, we are availing ourselves of the fact that every continuous real-valued function 
on $[0, 1]$ is the unique continuation of some code as above. If $x = \langle q_n : n \in \omega \rangle$ is
a code for a real number with $0 < x < 1$, then there is an $n_0$ such that $0 < q_n < 1$
for all $n > n_0$, and we write $f(x)$ for $\lim_{n > n_0,\, n \to \infty} f(q_n)$ (which exists by uniform
continuity of $f$). Note that real-valued functions on $\mathbb{Q}$ are just particular countable
subsets of $\mathbb{Q} \times \mathbb{R}$, so the codes in Definition 3.5.5 can themselves be represented
using Definitions 3.5.1 and 3.5.3.

With this in hand, we can formulate, e.g., the intermediate value theorem as a 
problem whose instances and solutions are (coded by) subsets of $\omega$.


Definition 3.5.6 (Intermediate value theorem). IVT is the problem whose instances
are continuous functions $f : [0, 1] \to \mathbb{R}$ with $f(0) < 0$ and $f(1) > 0$, with the
solutions to any such $f$ being all real numbers $x$ such that $0 < x < 1$ and $f(x) = 0$.


3.6 Complexity 


We now restrict to problems whose instances and solutions are subsets of $\omega$ in an
attempt to measure the “difficulty” of solving such problems from the point of view 
of our computability theoretic investigation. Since solving a problem is the task of 
taking a given instance and producing a solution, our aim is essentially to understand 
the computational resources necessary to carry out this process. We begin with the 
“easiest” problems in this sense. 


Definition 3.6.1. A problem P admits computable solutions if every instance $X$ of P
has a solution $Y \leq_T X$.


It is worth emphasizing that the instances and solutions of a problem that admits 
computable solutions need not themselves be computable. Indeed, as sets of numbers 
they may be arbitrarily complicated. Rather, it is the relationship between the two 
that is at issue: more precisely, the fact that it requires no additional computational 
power to go from any instance to at least one solution to it. It might thus be more 
honest to say that “P admits solutions computable in its instances”, but that would 
be rather unwieldy. 

Many problems admit computable solutions simply because each instance (com- 
putable or not) has a computable solution. In particular, problems whose solutions 
are natural numbers, like IPHP or FUF or $\mathrm{HBT}_{[0,1]}$, admit computable solutions. For
an example not of this sort, consider $\mathrm{RT}^1_2$: given a partition $c : \omega \to 2$, of arbitrary
complexity, each of the sets $\{x : c(x) = 0\}$ and $\{x : c(x) = 1\}$ is computable from $c$,
and at least one of the two is an infinite set and so is a solution to $c$. This is one situation
where the parallelization can be useful: although $\mathrm{RT}^1_2$ admits computable solutions,
its parallelization $\widehat{\mathrm{RT}^1_2}$ does not (Exercise 3.9.4).

It is also worth stressing that even problems with numerical solutions may not 
be objectively “easy” in any practical sense. A good example of this is the problem 
of finding Ramsey numbers, discussed following the statement of FRT in Definition 3.3.6.

In essentially all problems we encounter, including all those naturally derived from 
$\forall\exists$ theorems, the relationship between instances and solutions follows Maxim 2.7.1.
This means that such a problem satisfies a computability theoretic property concerning
its computable instances and their solutions if and only if, for every $A \subseteq \omega$,
the problem satisfies the relativized property concerning its A-computable instances 
and their solutions. Thus, to describe more complex types of problems, we can look 
only at computable instances and then generalize by relativization. 

For example, we expect that a problem whose computable instances all have
at least one computable solution will actually satisfy that for every $A \subseteq \omega$, every
$A$-computable instance $X$ has a solution $Y \leq_T A$. This is equivalent to the problem
admitting computable solutions. By the same token, a problem having at least one
computable instance with no computable solution should satisfy that for every $A \subseteq \omega$,
there is at least one $A$-computable instance $X$ having no solution $Y \leq_T A$. This is
stronger than merely not admitting computable solutions. And since the instances of
the problem may not be closed upward under $\leq_T$, it is also stronger than saying that
for every $A$ there is an $A$-computable instance $X$ with no solution $Y \leq_T X$.

More generally, we make the following abstract definition.


Definition 3.6.2. Let $\mathcal{K} = \{\mathcal{K}(A) : A \in 2^{\omega}\}$ be a class of subsets of $2^{\omega}$. Let P be a
problem.


1. P admits solutions in $\mathcal{K}$ if for every $A \subseteq \omega$, every $A$-computable instance $X$ of
P has a solution $Y \in \mathcal{K}(A)$.

2. P omits solutions in $\mathcal{K}$ if for every $A \subseteq \omega$, there is an $A$-computable instance $X$
of P having no solution $Y \in \mathcal{K}(A)$.


Our preceding example thus works with $\mathcal{K}(A) = \{Y : Y \leq_T A\}$ for every $A$. For
brevity, we can say P admits or omits low solutions if (1) or (2) above hold with
$\mathcal{K}(A) = \{Y : (A \oplus Y)' \leq_T A'\}$ for every $A$. Analogously, we can define what
it should mean for a problem to admit or omit low$_2$ solutions, or $\Delta^0_2$ solutions, or
arithmetical solutions, or solutions in any other natural computability theoretic class.
Again, we should really perhaps say that “P admits solutions that are low over its
instances”, “P admits solutions that are arithmetical in its instances”, etc., but we
prefer the shorter nomenclature for brevity. In all these examples, we are thinking
of $\mathcal{K}$ as (the class of sets satisfying) some relativizable property, and $\mathcal{K}(A)$ as (the
class of sets satisfying) the relativization to the specific set $A$.

To be sure, it is possible for a problem to neither admit nor omit solutions in a class 
K. In practice, this never happens outside of certain specific, and usually deliberately 
constructed, situations (see Exercise 3.9.3). For this reason, it is common simply to 
say that a problem does not admit solutions of a particular kind as a synonym for 
what we are calling here omitting. But technically, these are different notions. 

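The following observation is obvious:


Proposition 3.6.3. Let $\mathcal{K} = \{\mathcal{K}(A) : A \subseteq \omega\}$ and $\mathcal{K}' = \{\mathcal{K}'(A) : A \subseteq \omega\}$ be
classes of subsets of $2^{\omega}$ with $\mathcal{K}(A) \subseteq \mathcal{K}'(A)$ for all $A$. Let P be a problem. If P
admits solutions in $\mathcal{K}$ then it admits solutions in $\mathcal{K}'$. If P omits solutions in $\mathcal{K}'$ then
it omits solutions in $\mathcal{K}$.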


Typically, the classes $\mathcal{K}$ we consider in this context have $\mathcal{K}(A)$ closed under $\leq_T$
for every A. This way, admitting a solution in the given class can be regarded as 
an upper bound on the “difficulty” of the problem, and omitting solutions in it as 
a lower bound. So for example, if we know that a problem admits low solutions 
but not computable solutions, we may view the computability theoretic relationship 
between its instances and solutions as lying somewhere between the two classes. And 
we regard such a problem as “harder” than one that admits computable solutions, 
but “easier” than one that does not admit $\Delta^0_2$ solutions, and so on. This is our first
means of comparing the relative complexities of (certain) problems. 

We move to some specific examples. Three commonly encountered and related 
measures of complexity are the following, which use the notion of a set D having 
PA degree relative to a set $A$, denoted $D \gg A$, as defined in Definition 2.8.24.


Definition 3.6.4. Let P be a problem. 


1. P admits solutions in PA if for every $A \subseteq \omega$ and every set $D \gg A$, every
$A$-computable instance $X$ of P has a solution $Y$ such that $A \oplus Y \leq_T D$.

2. P admits PA avoidance if for every $A \subseteq \omega$ and every $C \not\ll A$, every $A$-computable
instance $X$ of P has a solution $Y$ such that $C \not\ll A \oplus Y$.

3. P codes PA if for every $A \subseteq \omega$, there is an $A$-computable instance $X$ of P with
$A \oplus Y \gg A$ for every solution $Y$ to $X$.


We note for completeness that these can all be formulated in terms of Definition 3.6.2. 
(Doing so does not really add any insight to these notions, but we mention it to 
highlight that these classes fit the general framework.) However, for (1) and (2) we 
actually need to use families of classes of subsets of $2^{\omega}$. For example, for (1), first
define $\mathcal{K}_f(A) = \{Y : Y \leq_T f(A)\}$ for every function $f : 2^{\omega} \to 2^{\omega}$ and set $A$. Then
P admits solutions in PA precisely if it admits solutions in $\mathcal{K}_f = \{\mathcal{K}_f(A) : A \in 2^{\omega}\}$
for every $f$ satisfying $f(X) \gg X$ for every $X$. For (3), a simpler definition suffices:
for each $A$ let $\mathcal{K}(A) = \{Y : A \oplus Y \not\gg A\}$. Then P codes PA precisely if it omits
solutions in $\mathcal{K} = \{\mathcal{K}(A) : A \in 2^{\omega}\}$.
The most obvious problem to look at in connection with these notions is WKL. 


Proposition 3.6.5. WKL admits solutions in PA and codes PA.


Proof. That WKL admits solutions in PA follows directly from the definition. That 
WKL codes PA follows from relativizing Proposition 2.8.14, that there exists an 
infinite computable tree each of whose paths has PA degree. □


Admitting solutions in PA is again an “upper bound” style result, while coding PA is 
a “lower bound” style result. So the preceding proposition can be seen as saying that 
the PA degrees precisely capture the complexity of the problem WKL. This is what 
we should expect. The following is an easy corollary. 


Proposition 3.6.6. Let P be a problem and let $\mathcal{K}$ be any relativizable class that forms
a basis for $\Pi^0_1$ classes. If P admits solutions in PA then P admits solutions in $\mathcal{K}$.


Thus, admitting solutions in PA entails admitting low solutions as well as hyper- 
immune free solutions, etc. (In particular, WKL admits low solutions as well as 
hyperimmune free solutions, so this terminology reflects what we should expect.) 
As we will see, proving that a given problem admits solutions in PA often involves 
“reducing” the problem to WKL in an appropriate sense. This will be made precise 
in Proposition 4.2.5. 

The next definition gives two additional notions of complexity of a problem. 


Definition 3.6.7. Let P be a problem.


1. P admits cone avoidance if for every $A \subseteq \omega$ and every $C \not\leq_T A$, every $A$-computable
instance $X$ of P has a solution $Y$ such that $C \not\leq_T A \oplus Y$.

2. P codes the jump if for every $A \subseteq \omega$, there is an $A$-computable instance $X$ of P
with $A' \leq_T A \oplus Y$ for every solution $Y$ to $X$.


In part (1), cone refers to the set $\{Z : C \leq_T Z\}$, “the cone above $C$”, and cone
avoidance to being able to stay outside of this set. In applications, we often contrast 
this with not being able to avoid one specific cone, namely that of the jump, which 
is how we obtain (2). 

The parallel between cone avoidance and PA avoidance, and between coding the 
jump and coding PA, should be clear. We add the following more explicit connection. 


Proposition 3.6.8. If P admits solutions in PA then it admits cone avoidance. If P 
codes the jump then it codes PA. 


Proof. By the cone avoidance basis theorem (Theorem 2.8.23), for every noncomputable
set $C$ there is a set $D \gg \emptyset$ such that $C \not\leq_T D$. Now relativize to arbitrary sets
$A \subseteq \omega$. This gives the first part. For the second, recall that $A' \gg A$ for every $A$. □


The canonical example of a problem coding the jump here is the following. 


Definition 3.6.9 (Existence of the Turing jump). TJ is the problem whose instances 
are all subsets of $\omega$, with the unique solution to any such set being its Turing jump.


Clearly, TJ codes the jump. This is a somewhat technical principle that we will use 
mostly as a means of showing other problems code or do not code the jump. A more 
intrinsically interesting example is König’s lemma.


Proposition 3.6.10. KL codes the jump. 


Proof. We define a computable finitely-branching tree $T \subseteq \omega^{<\omega}$ with a unique path
computing $\emptyset'$. The full result then follows by relativization. For ease of presentation,
we assume $\Phi_e(e)[0] \uparrow$ for all $e$. Let $T$ consist of all $\alpha \in \omega^{<\omega}$ satisfying the following
conditions for all $e < |\alpha|$:


1. if $\alpha(e) = 0$ then $\Phi_e(e)[|\alpha|] \uparrow$,
2. if $\alpha(e) = s > 0$ then $\Phi_e(e)[s] \downarrow$ and $\Phi_e(e)[s - 1] \uparrow$.


Checking whether a given $\alpha$ satisfies these conditions is uniformly computable in
$\alpha$, so $T$ is computable. It is readily seen that each $\alpha \in T$ has at most two immediate
successors. And it is clear that $T$ contains a single path $f \in \omega^{\omega}$, with $f(e) = 0$
if $\Phi_e(e) \uparrow$, and $f(e) \neq 0$ if $\Phi_e(e) \downarrow$ and $f(e)$ is the least $s$ such that $\Phi_e(e)[s] \downarrow$.
Hence, $f$ clearly computes $\emptyset'$ (in fact, $f \equiv_T \emptyset'$). □
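
The membership test for the tree $T$ in the proof of Proposition 3.6.10 can be sketched as follows. This is a toy model only: the genuine predicate "$\Phi_e(e)[s] \downarrow$" is replaced by a hypothetical finite table `HALT_STAGE` of halting stages, and the names `halts_by` and `in_tree` are ours. What the sketch is meant to show is that deciding $\alpha \in T$ only ever asks bounded questions of the form "has $\Phi_e(e)$ halted by stage $s$?", which is why $T$ is computable even though its unique path is not.

```python
# Hypothetical toy data: e -> least stage s >= 1 at which Phi_e(e) halts.
# Indices missing from the table are treated as never halting.
HALT_STAGE = {1: 3, 4: 2}


def halts_by(e, s):
    """Toy stand-in for the decidable predicate 'Phi_e(e)[s] converges'."""
    return e in HALT_STAGE and HALT_STAGE[e] <= s


def in_tree(alpha, halts_by):
    """Decide whether the finite sequence alpha belongs to the tree T."""
    for e, a in enumerate(alpha):
        if a == 0:
            # Condition 1: no convergence seen within |alpha| stages.
            if halts_by(e, len(alpha)):
                return False
        else:
            # Condition 2: a is exactly the least halting stage.
            if not halts_by(e, a) or halts_by(e, a - 1):
                return False
    return True
```

In this toy model the sequence `(0, 3, 0, 0, 2)` is an initial segment of the unique path, while `(0, 0)` lies in the tree but is a dead end: it is discarded once the length reaches the halting stage 3 of index 1.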


In light of Propositions 3.6.5 and 3.6.8, we thus have a computability theoretically 
detectable distinction between weak König’s lemma and König’s lemma (even for
2-branching trees). As seen in the above proof, this stems from the fact that instances
of the latter are not constrained to $2^{<\omega}$, and hence a string $\sigma$ can have immediate
successors that are arbitrarily far apart. 

We conclude this section by mentioning two additions to the framework of Defini- 
tion 3.6.2. The first is a specialization of the definition to the case where K represents 
a notion of computational weakness, meaning a property of sets closed downward 
under $\leq_T$. The following definition was originally articulated by Wang [322] and
Patey [243]. 


Definition 3.6.11 (Preservation of properties). Let $\mathcal{C}$ be a class of sets closed
downward under $\leq_T$. A problem P admits preservation of $\mathcal{C}$ if for every $A \in \mathcal{C}$,
every $A$-computable P-instance $X$ has a solution $Y$ with $A \oplus Y \in \mathcal{C}$.


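We can relate this to our earlier framework as follows. Given $\mathcal{C}$, define the class
$\mathcal{K} = \{\mathcal{K}(A) : A \in 2^{\omega}\}$, where $\mathcal{K}(A) = \{Y : A \oplus Y \in \mathcal{C}\}$ if $A \in \mathcal{C}$ and $\mathcal{K}(A) = 2^{\omega}$
otherwise. Then P admits preservation of $\mathcal{C}$ if and only if it admits solutions in $\mathcal{K}$.
And if P omits solutions in $\mathcal{K}$ then it does not admit preservation of $\mathcal{C}$. The converse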
of this may not hold in general, but it is what we see in natural cases (as we should 
expect by Maxim 2.7.1). We give one important example. 


Proposition 3.6.12. Fix $D \gg \emptyset$ and let $\mathcal{C} = \{A \in 2^{\omega} : D \gg A\}$. Then WKL admits
preservation of $\mathcal{C}$.


Proof. Clearly $\mathcal{C}$ is closed downward under $\leq_T$. Fix $A \in \mathcal{C}$ and let $T \leq_T A$ be an
instance of WKL. Thus $T$ is an infinite subtree of $2^{<\omega}$. By Proposition 2.8.26, there is
a $D^*$ such that $D \gg D^* \gg A$. By definition of $\gg$, there is an $f \in [T]$ with $f \leq_T D^*$.
Hence also $A \oplus f \leq_T D^*$ and so $A \oplus f \ll D$. Thus $A \oplus f \in \mathcal{C}$, and since $f$ is a
WKL-solution to $T$, the proof is complete. □


The second addition we make to Definition 3.6.2 is a technical variation, employed 
heavily in shading out subtle distinctions between mathematical theorems. 


Definition 3.6.13 (Strong solutions). Let $\mathcal{K} = \{\mathcal{K}(A) : A \in 2^{\omega}\}$ be a class of
subsets of $2^{\omega}$. A problem P admits strong solutions in $\mathcal{K}$ if for every $A \subseteq \omega$, every
instance $X$ of P has a solution $Y \in \mathcal{K}(A)$.


The emphasis is thus on the fact that the instance X need not be computable from 
the set $A$. So, for example, P admits strong PA avoidance if for every set $A$, every
instance $X$ (computable from $A$ or not) has a solution $Y$ such that $A \oplus Y \not\gg A$; P
admits strong cone avoidance if for every set $A$ and every $C \not\leq_T A$, every instance $X$
has a solution $Y$ such that $C \not\leq_T A \oplus Y$.

It may be hard to believe that, outside of artificial examples, a problem could 
admit strong solutions (at least, in any interesting class). In fact, this is the case. We 
will revisit this notion, and see a number of natural examples and their applications, 
in Section 4.4 and Chapters 8 and 9. 


3.7 Uniformity


In this section, we consider the following stronger form of admitting computable 
solutions. 


Definition 3.7.1. A problem P uniformly admits computable solutions if there is a
Turing functional $\Phi$ so that $\Phi(X)$ is a solution to every instance $X$ of P.


We could also formulate uniform versions of many of the other complexity measures
described in the previous section, though we will not do this, in favor of a more nuanced
approach that we develop in Section 4.3.

As we have seen, issues of uniformity often crop up in our discussion. Often, 
when constructing a solution computably from a given instance of some problem, 
we nonetheless break into cases that cannot be distinguished effectively. A basic 
example is IPHP. 


Proposition 3.7.2. IPHP does not uniformly admit computable solutions. 


Proof. Seeking a contradiction, suppose IPHP uniformly admits computable solutions,
with witness $\Phi$ as in Definition 3.7.1. Let $c_0 : \omega \to 2$ be defined by $c_0(x) = 0$
for all $x$. Then $c_0$ is an instance of IPHP, and we must have $\Phi(c_0) = \{0\}$. Let $\sigma \in 2^{<\omega}$
be a long enough initial segment of $c_0$ so that $\Phi(\sigma)(0) \downarrow = 1$. Then $\Phi(c)(0) \downarrow = 1$
for every $c : \omega \to 2$ that extends $\sigma$. Define $c : \omega \to 2$ by $c(x) = \sigma(x) = 0$ for
$x < |\sigma|$ and $c(x) = 1$ for $x \geq |\sigma|$. Then $\Phi(c)(0) \downarrow = 1$, even though we should have
$\Phi(c) = \{1\}$. □


Intuitively, to solve an instance c: w — 2 of IPHP we break into cases: there are 
infinitely many x such that c(x) = 0, there are infinitely many x such that c(x) = 1. 
If exactly one of these cases is true, we cannot computably determine which. The 
above proposition basically says that this is intrinsic to the problem and cannot be 
eliminated (e.g., by a more clever argument). 
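
The finite-use argument behind Proposition 3.7.2 can be illustrated with a small adversary written in Python. This is a sketch under simplifying assumptions: a candidate "uniform solver" is modeled as a deterministic, always-halting Python function that receives a coloring as an oracle (a function from numbers to $\{0, 1\}$) and returns the color it claims occurs infinitely often. The names `defeat` and `sample_solver` are hypothetical.

```python
def defeat(solver):
    """Return (c, answer): a coloring c on which solver's answer is wrong."""
    queried = []

    def c0(x):
        # The constant-0 coloring, recording which positions the solver inspects.
        queried.append(x)
        return 0

    answer = solver(c0)
    if answer != 0:
        # Already wrong: color 1 never occurs in the constant-0 coloring.
        return (lambda x: 0), answer

    # The solver saw only finitely many positions; switch to color 1 beyond them.
    use = max(queried, default=-1) + 1

    def c(x):
        return 0 if x < use else 1

    # The solver cannot tell c from c0, so it repeats its answer 0,
    # although color 0 occurs only finitely often in c.
    return c, solver(c)


def sample_solver(c):
    """A naive illustrative solver: majority vote over the first 100 positions."""
    zeros = sum(1 for x in range(100) if c(x) == 0)
    return 0 if zeros >= 50 else 1
```

The bound `use` plays the role of the length of the initial segment $\sigma$ in the proof.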

Nonuniform reasoning can be rather subtle as well. We recall IVT, the problem 
form of the intermediate value theorem, from Section 3.5. 


Proposition 3.7.3. The problem IVT admits computable solutions.


Proof. Let $f$ be an instance of IVT. If there is a $q \in \mathbb{Q}$ such that $f(q) = 0$ then
we are done, so assume otherwise. We define two sequences of rational numbers,
$\langle q_n : n \in \omega \rangle$ and $\langle r_n : n \in \omega \rangle$, as follows. Let $q_0 = 0$ and $r_0 = 1$. Given $q_n$ and $r_n$
for some $n \in \omega$, form their mean, $m_n = (q_n + r_n)/2$. Since $m_n \in \mathbb{Q}$, it follows by
our assumption that either $f(m_n) < 0$ or $f(m_n) > 0$. In the first case, let $q_{n+1} = m_n$
and $r_{n+1} = r_n$, and in the second, let $q_{n+1} = q_n$ and $r_{n+1} = m_n$. By induction, it is
easily verified that for every $n \in \omega$, we have $q_n < r_n$, $f(q_n) < 0$ and $f(r_n) > 0$, and
each of the quantities $|q_{n+1} - q_n|$, $|r_{n+1} - r_n|$, and $|q_{n+1} - r_{n+1}|$ is smaller than $2^{-n}$.
It follows that $\langle q_n : n \in \omega \rangle$ and $\langle r_n : n \in \omega \rangle$ are codes for one and the same real
number, $x$, and by continuity $f(x) = 0$. It remains only to verify that (the code) $x$ is
computable from $f$. By the remark following Definition 3.5.2, whether $f(m_n) < 0$ or
$f(m_n) > 0$ above is uniformly computable in $n$. Hence, the sequences $\langle q_n : n \in \omega \rangle$
and $\langle r_n : n \in \omega \rangle$ are both computable, and so is $x$. □
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Of course, the nonuniformity above is where we assume f has no rational zeroes. 
Note that past this moment, our construction is entirely uniform. And indeed, the 
restriction of IVT to functions with no rational zeroes therefore does uniformly admit 
computable solutions. We can exploit this to construct instances of IVT (necessarily 
with rational zeroes) witnessing that IVT itself is not. 
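
The uniform part of the construction is ordinary bisection. As a rough illustration
only, here is a sketch in exact rational arithmetic. The names bisect and sign_f are
hypothetical, and the sample function f(x) = x^2 − 1/2 is an assumption chosen because
it has no rational zero; in the proof itself the sign of f(m_n) is found by searching
through approximations of a real code.

```python
from fractions import Fraction


def bisect(sign, n):
    """Return (q_n, r_n) as in the proof of Proposition 3.7.3.

    sign(m) must return a negative number if f(m) < 0 and a positive number
    if f(m) > 0; it is never called on a zero of f by assumption.
    """
    q, r = Fraction(0), Fraction(1)
    for _ in range(n):
        m = (q + r) / 2
        if sign(m) < 0:
            q = m  # first case: q_{n+1} = m_n, r_{n+1} = r_n
        else:
            r = m  # second case: q_{n+1} = q_n, r_{n+1} = m_n
    return q, r


def sign_f(m):
    """Sign of f(m) for f(x) = x^2 - 1/2, which has no rational zero."""
    return -1 if m * m < Fraction(1, 2) else 1
```

After n steps the two endpoints differ by exactly 2^{-n} and still bracket the zero.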


Proposition 3.7.4. IVT does not uniformly admit computable solutions. 


Proof. For clarity, we will write ⟨x, y⟩ for points in ℝ², to distinguish from the
open interval (a, b). Suppose to the contrary that there is a Turing functional Φ
mapping instances of IVT to solutions. We shall deal only with continuous piecewise
linear functions on [0,1] ∩ ℚ in this proof, which are all uniformly continuous.
Given points ⟨x, y⟩ and ⟨w, z⟩ in ℚ², let ℓ_{⟨x,y⟩→⟨w,z⟩} denote the equation of the line
segment connecting these two points.

For each n ∈ ω, define f_n: [0,1] ∩ ℚ → ℝ by 


Srl) = di -2-7-3)9 (3 a) “ <q<, 
€(2 9-m-1y5(1,1) (9) if} <q<l. 


We have f_n(0) < 0, f_n(1) > 0, and it is easy to check that |f_{n+1}(q) − f_n(q)| < 2^{-n} for
each rational q. Thus, we can define an instance f of IVT by f(q) = (f_n(q) : n ∈ ω).

Say Φ(f) = (q_n : n ∈ ω), and fix a finite initial segment σ ≺ f long enough so
that Φ(σ) produces (q_0, q_1, q_2, q_3, q_4). Since σ is finite it can only determine the
value of f(q) for finitely many q, and for each of these only up to a finite degree
of approximation. Hence, we may fix an n such that σ is also an initial segment of
any function g: [0,1] ∩ ℚ → ℝ such that for all q, if g(q) = (r_m : m ∈ ω) then
r_m = f_m(q) for all m < n. Call any such g good. 

We can view f_n itself as good. (More precisely, we can define g(q) = (r_m : m ∈ ω)
where r_m = f_m(q) for all m < n and r_m = f_n(q) for all m ≥ n. Then as reals,
g(q) = f_n(q) for all q.) Since f_n is an instance of IVT with a unique zero at q = 1/2, it
follows that q_4 lies within 2^{-4} of 1/2. Hence, any g which is good and also an instance
of IVT must have a zero within 2^{-3} of 1/2, i.e., in the interval (1/2 − 2^{-3}, 1/2 + 2^{-3}).

Now define h: [0,1] ∩ ℚ → ℝ as follows: 


€(0,-1)9(4,-2-7-3) (Q) if0<q<, 
eg eemyag-amy(@ ifs <a <5 
nia =4 ee 
(5-2-3) (2,2--3) (9) Wes<q<3, 

sod 
€(2.2--1) 94,1) (9) if; <q<l. 


That is, # is obtained from f,, by replacing f,(q) for each q in the interval (4, 2) 
by —2-"-3, and then connecting the points (3, —2-"-3) and (3, 2-"-3) go that h is 
continuous. (See Figure 3.1.)


Figure 3.1. An illustration to Proposition 3.7.4. The functions f_n (solid) and h (dashed) overlaid.


Now for each q ∈ [0,1] ∩ ℚ define g(q) = (r_m : m ∈ ω) where r_m = f_m(q) for all m < n
and r_m = h(q) for all m ≥ n. Then
g(q) is a real number for each q. This is clear if q lies outside the interval on which h was
modified, since there h = f_n. If q does lie in the interval, then since |f_n(q)| < 2^{-n-1} we have
|h(q) − f_n(q)| < 2^{-n}, as needed. Thus, g is good and an instance of IVT, but it has
no zero within 2^{-3} of 1/2. The proof is complete. □


3.8 Further examples 


In this section, we consider two use cases of the framework for measuring complexity 
introduced in Section 3.6. Both are problems arising from compactness principles. 
The first is the “disjunctive form” of the Heine–Borel theorem, Disj-HBT_[0,1], introduced
earlier. We show Disj-HBT_[0,1] has the same computability theoretic bounds 
as WKL, which is not surprising since these are both problems corresponding to 
basically the same mathematical statement (compactness of the closed unit interval 
or Cantor space). 

The second principle is SeqCompact_{2^ω}, which we define below. This corresponds 
to the statement of sequential compactness of Cantor space, and as we will see, it 
exhibits very different behavior. In this way, our investigation into the complexity of 
problems can also be reflective of underlying non-logical (in this case topological) 
properties. 


Proposition 3.8.1. The problem Disj-HBT_[0,1] admits solutions in PA. 


Proof. We restrict to computable instances. The general case follows by relativization.
So fix a computable instance of Disj-HBT_[0,1], which is a sequence
((x_k, y_k) : k ∈ ω) of reals with x_k < y_k for all k. Using the definition of codes of
real numbers, we can fix a computable sequence of intervals with rational endpoints
whose union covers ⋃_{k∈ω} (x_k, y_k). Thus, without loss of generality, we may assume
all the x_k and y_k are rational numbers.

First, we inductively define an interval I_σ ⊆ [0,1] for each σ ∈ 2^{<ω}. Let
I_⟨⟩ = [0,1], and having defined I_σ = [a, b] for some σ ∈ 2^{<ω}, let I_{σi} for i < 2
be [a + ((b − a)/2)·i, b − ((b − a)/2)·(1 − i)]. Thus, I_σ is the closed dyadic interval whose
position within [0,1] is determined by the bits of σ, with 0 denoting the “left” subinterval
and 1 the “right”. 

Now construct a binary tree T as follows. Let σ ∈ 2^{<ω} belong to T if and only
if I_σ is not contained in ⋃_{k<|σ|} (x_k, y_k). All the x_k and y_k are rationals, so T is
computable in the sequence ((x_k, y_k) : k ∈ ω). If T is finite, fix ℓ such that T
contains no strings σ ∈ 2^{<ω} of length ℓ. Since all of [0,1] is the union of I_σ for
σ ∈ 2^ℓ, we have that every x ∈ [0,1] belongs to the interval (x_k, y_k) for some k < ℓ.
Then ℓ is a (computable) solution to our instance. 

So suppose next that T is infinite. Given any D ≫ ∅, there is a path f through T
computable from D. For each n, let q_n be the left endpoint of the interval I_{f↾n}. Then
|q_n − q_{n+1}| < 2^{-n} for all n, so (q_n : n ∈ ω) defines a real number x. By construction,
x ∈ I_{f↾n} for all n. Thus if x belonged to (x_k, y_k) for some k, then I_{f↾n} would be
contained in (x_k, y_k) for all sufficiently large n, which cannot be since f↾n ∈ T.
Thus, x is an element of [0,1] not covered by the intervals (x_k, y_k), and hence a
solution to our instance. It remains only to observe that x ≤_T f ≤_T D. □
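
For rational endpoints, membership in the tree T from this proof is a finite check. The
following is a small sketch under that assumption; the function names dyadic_interval,
covered and in_tree are hypothetical, and strings σ are represented as tuples of bits.

```python
from fractions import Fraction


def dyadic_interval(sigma):
    """Endpoints (a, b) of the closed dyadic interval I_sigma."""
    a, b = Fraction(0), Fraction(1)
    for bit in sigma:
        mid = (a + b) / 2
        a, b = (a, mid) if bit == 0 else (mid, b)
    return a, b


def covered(a, b, opens):
    """True if [a, b] is contained in the union of the open intervals in opens."""
    point = a  # leftmost point of [a, b] not yet known to be covered
    while True:
        reach = [y for (x, y) in opens if x < point < y]
        if not reach:
            return False  # point lies in none of the intervals
        point = max(reach)  # everything strictly below this is covered
        if point > b:
            return True


def in_tree(sigma, cover):
    """sigma is in T iff I_sigma is not inside the union of the first |sigma| intervals."""
    a, b = dyadic_interval(sigma)
    return not covered(a, b, cover[:len(sigma)])
```

For example, with the two intervals (−1/4, 3/5) and (1/2, 5/4), the strings ⟨⟩ and ⟨1⟩
belong to T but no string of length 2 does, so ℓ = 2 is a solution in the finite case.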


We now describe a method of embedding subtrees of 2^{<ω} into the interval [0, 1]. 
This will be of independent interest to us in Chapter 10. 


Lemma 3.8.2. Given a tree T ⊆ 2^{<ω}, there is a uniformly T-computable enumeration
(U_i : i ∈ ω) of open rational intervals in [0,1] as follows. Let U = ⋃_{i∈ω} U_i.
There is a computable injection from [T] to [0,1] \ U with a computable inverse
from [0,1] \ U to [T]. In particular, [T] = ∅ if and only if U = [0,1]. 


Proof. This construction, which is inspired by the construction of the middle-thirds
Cantor set C, is illustrated in Figure 3.2. Recall that C ⊆ [0,1] is formed by creating
a sequence (C_n) of closed sets and then letting C = ⋂_{n∈ω} C_n. Each set C_i is a finite
collection of 2^i closed subintervals. In particular, we can identify a closed interval
C_τ for each τ ∈ 2^{<ω} so that C_i = ⋃_{|τ|=i} C_τ and each C_τ is 3^{-|τ|} units in width. 

We construct two sequences of open rational intervals, (I_σ : σ ∈ 2^{<ω}) and
(J_σ : σ ∈ 2^{<ω}). For each σ ∈ 2^{<ω}, let I_σ be the open middle third interval of C_σ,
so that I_σ is removed from C_σ in the next stage of the construction of C. In particular,
I_σ and I_τ are disjoint from one another and from C for all distinct σ, τ ∈ 2^{<ω}. Let
I = ⋃_{σ∈2^{<ω}} I_σ, so I ∩ C = ∅ and I ∪ C = [0,1]. 

Let J_σ be an open interval containing C_σ and extending an additional 3^{-(|σ|+1)}
units to either side. Thus J_σ and J_τ are disjoint unless τ ⪯ σ or σ ⪯ τ. The
endpoints of I_σ and J_σ are rational and are uniformly computable from σ. 

Now, given a tree T ⊆ 2^{<ω}, we let the sequence (U_i : i ∈ ω) be an effective
enumeration of the set {I_σ : σ ∈ 2^{<ω}} ∪ {J_σ : σ ∉ T}. It is immediate from the
construction of the Cantor set that there is a computable function F: 2^ω → C with
a computable inverse. Let U = ⋃_i U_i. Then, for each f ∈ 2^ω, we have f ∈ [T] ⟺
F(f) ∉ ⋃_{σ∉T} J_σ ⟺ F(f) ∈ [0,1] \ U. □


Figure 3.2. Construction from the proof of Lemma 3.8.2. The proof constructs an embedding of
the Cantor space 2^ω into [0, 1]. Each sequence τ ∈ 2^{<ω} corresponds to a closed interval C_τ. The
open intervals I_τ cover the complement of the Cantor set, and each open interval J_τ covers C_τ.
The interval J_⟨⟩ is not shown. 
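
The endpoints of the intervals used in the proof of Lemma 3.8.2 are easy to compute
explicitly. The following sketch is illustrative only; the function names C, I and J
mirror the notation of the proof, and strings are represented as tuples of bits (an
assumption of this example).

```python
from fractions import Fraction


def C(tau):
    """Closed interval C_tau, of width 3^(-len(tau)), in the middle-thirds construction."""
    a, w = Fraction(0), Fraction(1)
    for bit in tau:
        w /= 3
        if bit == 1:
            a += 2 * w  # move to the right-hand third
    return a, a + w


def I(sigma):
    """Open middle third of C_sigma (given by its endpoints)."""
    a, b = C(sigma)
    w = (b - a) / 3
    return a + w, b - w


def J(sigma):
    """Open interval around C_sigma, extended by 3^(-(len(sigma)+1)) on each side."""
    a, b = C(sigma)
    pad = Fraction(1, 3 ** (len(sigma) + 1))
    return a - pad, b + pad
```

For instance, J applied to (0,) and to (1,) gives (−1/9, 4/9) and (5/9, 10/9), which are
disjoint, in line with the claim that J_σ and J_τ can only meet when σ and τ are comparable.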


Proposition 3.8.3. The problem Disj-HBT_[0,1] codes PA. 


Proof. We construct a computable instance of Disj-HBT_[0,1] all of whose solutions
have PA degree. The full result is again proved by relativizing the computable case.

We showed in Section 2.8.1 that DNC_2 is a Π^0_1 class, and in particular there is a
computable tree T ⊆ 2^{<ω} so that every element of [T] has PA degree. Apply Lemma 3.8.2
with this T to obtain a computable instance (U_i : i ∈ ω) of Disj-HBT_[0,1]. Because
[T] is nonempty, U = ⋃_i U_i ≠ [0,1], and so each solution to this instance is an
element of [0,1] \ U. Hence, each solution computes a path through T and therefore
has PA degree. □


We now turn to the second of our examples in this section, which is a formulation 
of sequential compactness of 2^ω introduced by Hamkins (personal communication) 
under the name NonSpilit,,. In the next chapter, we will give this another name, COH, 
which is a prominent problem in the reverse mathematics of Ramsey’s theorem. 


Definition 3.8.4 (Sequential compactness of 2^ω). SeqCompact_{2^ω} is the principle
whose instances are sequences R = (R_i : i ∈ ω) of elements of 2^ω, with solutions
to any such sequence being all infinite S ⊆ ω so that for each e, the values R_i(e) are
the same for almost all i ∈ S. 


Proposition 3.8.5. SeqCompact_{2^ω} admits PA avoidance. 


Proof. As usual, we work only with computable instances. Our proof will relativize.
To show SeqCompact_{2^ω} admits PA avoidance, fix a computable instance (R_i : i ∈ ω)
of SeqCompact_{2^ω}. For subsets X and Y of ω, write X ⪯ Y if X is finite, X ⊆ Y,
and Y − X > X. In this proof, we follow our convention that any computation with a
finite set as oracle is assumed to have use bounded by the maximum of the finite set. 

We define a sequence F_0 ⪯ F_1 ⪯ ··· of finite sets, and a sequence I_0 ⊇ I_1 ⊇ ···
of infinite computable sets, satisfying the following properties.


1. lim_e |F_e| = ∞.

2. F_{e'} − F_e ⊆ I_e for all e' > e.

3. If i, j ∈ I_e then R_i(e) = R_j(e).

4. There is an n such that either Φ_e^{F_e}(n)↓ = Φ_n(n)↓ ∈ {0, 1}, or there is no X ⪰ F_e
with X ⊆ I_e such that Φ_e^X(n)↓ ∈ {0, 1}. 


We let S = ⋃_{e∈ω} F_e, which is then clearly a solution to (R_i : i ∈ ω). For each e, we
have F_e ⪯ S ⊆ I_e, and so condition (4) ensures that S computes no 2-valued DNC
function and hence that S ≫ ∅ fails. 

For ease of notation, let F_{−1} = ∅ and I_{−1} = ω, and assume that for some e ∈ ω
we have defined F_{e−1} and I_{e−1}. We consider two cases:

Case 1: There exist n ∈ ω and a finite set F ⪰ F_{e−1} with F ⊆ I_{e−1} such that
Φ_e^F(n)↓ = Φ_n(n)↓ ∈ {0, 1}. In this case, set F_e = F.

Case 2: Otherwise. In this case, set F_e = F_{e−1}.


In any case, we let I_e consist of all i > F_e in I_{e−1} such that R_i(e) = 0 if there are
infinitely many such i, and otherwise we let I_e consist of all i > F_e in I_{e−1} such that
R_i(e) = 1. 


It is easy to check that we satisfy properties (1)–(3) above for every e, and if
we are in Case 1, also (4). So say we are in Case 2. We claim that for some n,
there is no finite F ⪰ F_{e−1} with F ⊆ I_{e−1} such that Φ_e^F(n)↓ ∈ {0, 1}, which again
gives (4). Suppose otherwise. Then for each n, we can computably search for the
first F ⪰ F_{e−1} as above, and record the value of Φ_e^F(n) as f(n). By assumption, this
defines a 2-valued function f. Now if Φ_n(n)↓ ∈ {0, 1} for some n then it must be
that f(n) ≠ Φ_n(n), else Case 1 would have applied. Thus, f is DNC. But f is
computable, a contradiction. □


Proposition 3.8.6. SeqCompact_{2^ω} omits PA solutions. 


Proof. We construct a computable instance (R_i : i ∈ ω) of SeqCompact_{2^ω} and
exhibit an f ≫ ∅ that computes no solution to this instance. More specifically, let
T ⊆ 2^{<ω} be the infinite computable tree whose paths are all the 2-valued DNC
functions. Our f will be a path through T, and will satisfy the following requirement
for all e ∈ ω:


𝓡_e: If Φ_e^f has unbounded range then for each b < 2 there are infinitely
many i ∈ ω such that Φ_e^f(n)↓ = i for some n and R_i(e) = b. 


Clearly, this will have the desired effect. 

We proceed in stages. For each e, we define numbers n_e ∈ ω and b_e < 2, which we
may periodically redefine throughout the construction. To start, we set n_e = b_e = 0
for all e. Also, we initially declare all ⟨e, n⟩ ∈ ω active.

At stage s, assume we have already defined R_i ↾ s for all i < s. We define R_i(s)
for all i < s, along with R_s ↾ s + 1. For each k < s, let U_{k,s} be the tree of all σ ∈ T
for which it is not the case that Φ_e^σ(n)↓ > n_e for any active ⟨e, n⟩ ≤ k. Thus if
j < k < s we have U_{k,s} ⊆ U_{j,s}. Say ⟨e, n⟩ requires attention at stage s if it is active
and U_{⟨e,n⟩,s} does not contain strings of length s but each U_{k,s} for k < ⟨e, n⟩ does.
So if k < ⟨e, n⟩ and σ ∈ U_{k,s} has length s, then Φ_e^σ(n)↓ = i for some i > n_e. By our
use convention, we have i < s, so in particular R_i(e) is already defined. In this case,
we redefine n_e = s, redefine b_e to be 0 or 1 depending as it was 1 or 0, respectively,
and declare ⟨e, n⟩ inactive. We then set R_{i*}(e*) = b_{e*} for all i*, e* < s where this is
undefined. 

The trees U_{k,s} in the construction are all uniformly computable in k and s. From
this it is evident that (R_i : i ∈ ω) is computable. Now, note that at most one ⟨e, n⟩
can require attention at any given stage. Thus, a pair ⟨e, n⟩ is either always active or
it is inactive from some stage onward. Given k, we can thus take s > k large enough
so that every ⟨e, n⟩ ≤ k active at stage s is active forever. Then for all t > s, we
have U_{k,t} = U_{k,s}, and this tree is infinite. Write U_k in place of U_{k,s} for some (any)
such s. We thus have a nested sequence U_0 ⊇ U_1 ⊇ ··· of infinite subtrees of T. By
Corollary 2.8.10, there is thus some f ∈ 2^ω which is a path through U_k for all k. 

Fix e. We claim f, which is also a path through T, satisfies 𝓡_e. By construction, if
some ⟨e, n⟩ requires attention at a stage s, then every σ ∈ U_{⟨e,n⟩,s} of length s satisfies
Φ_e^σ(n)↓ = i for some i > n_e and R_i(e) = b_e for the values of n_e and b_e current at
stage s. At that point, n_e is redefined to be a larger number, and b_e is redefined to be
the opposite bit, and neither of these numbers is redefined again until, if ever, ⟨e, n'⟩
requires attention for some other n'. But f is itself an extension of some σ ∈ U_{⟨e,n⟩,s}
of length s. It follows that for f to fail to satisfy the conclusion of the statement of
𝓡_e, it must be that there is an n such that ⟨e, n⟩ never requires attention (and hence is
active forever). Then n_e reaches a limit value, and by definition, no τ ∈ U_{⟨e,n⟩} can
satisfy Φ_e^τ(n)↓ > n_e. Since f is a path through U_{⟨e,n⟩}, this implies Φ_e^f has bounded
range. This completes the proof. □


3.9 Exercises 


Exercise 3.9.1. Prove that there is a computable instance of KL having a single 
noncomputable set as a solution. 


Exercise 3.9.2. Consider the following two problems. 


• Instances are all functions f: ω → ω with bounded range, with a solution to f
being all k ∈ ω such that f is k-bounded.

• Instances are all functions f: ω → ω, with a solution to f being all k ∈ ω such
that f is k-bounded, or arbitrary if f has unbounded range. 


Show that neither problem uniformly admits computable solutions.


Exercise 3.9.3. Find an example of a problem that neither admits nor omits
computable solutions. (It need not be a problem from “ordinary mathematics”.) 


Exercise 3.9.4. Consider RT], the parallelization of RT (Definition 3.1.5). Show 
that RT}, codes the jump. 


Exercise 3.9.5. Find an example of two problems P and Q so that each admits cone 
avoidance but the parallel product, P x Q, does not. 


Exercise 3.9.6. Show that the problem whose instances are pairs (x, y) of distinct 
real numbers, with solutions being 0 or 1 depending as x < y or y < x, does not 
uniformly admit computable solutions. 


Exercise 3.9.7. Show that the problem whose instances are real numbers x, with 
a solution being 0 or 1 depending as x = 0 or x ≠ 0, does not uniformly admit 
computable solutions. 


Exercise 3.9.8 (Hirst [156]). Given an infinite sequence (x_i : i ∈ ω) of reals, there
is an infinite sequence (y_i : i ∈ ω) such that y_n = min{x_i : i ≤ n} for all n.
However, in general, there is no computable sequence (j(n) : n ∈ ω) such that
x_{j(n)} = min{x_i : i ≤ n} for all n. 


Exercise 3.9.9. Let C_ℕ (the choice problem on ℕ) be the problem whose instances
are all functions e: ω × ω → 2 as follows.


• e(x, 0) = 0 for all x ∈ ω.
• For each x ∈ ω there is at most one s such that e(x, s) ≠ e(x, s + 1).
• There is at least one x ∈ ω such that e(x, s) = 0 for all s.


For each n ≥ 2, let C_n (the choice problem on n) be the restriction of C_ℕ to functions
e: n × ω → 2. Show that neither C_ℕ nor C_n uniformly admits computable solutions. 


Exercise 3.9.10. Does there exist a problem P which is not uniformly computably
true, but such that its parallelization, P̂, is computably true? 


Chapter 4


Problem reducibilities


The approach developed in Section 3.6 provides a means of measuring the complexity 
of an instance-solution problem, but it is too crude in general to adequately compare 
the complexities of two or more problems. As noted earlier, if one problem admits 
solutions in a particular class of sets and another does not, then we may view the 
latter as “harder” from a certain computational standpoint. But it is not obvious how 
to find such a class for a particular pair of problems, or whether such a class even 
exists. It is also unclear what relationship this kind of classification really expresses. 

In this chapter, we develop direct methods of gauging the relative strengths of 
problems. We think of these as reducibilities, or reductions, in that we are defining 
various ways of reducing (the task of solving) one problem to (the task of solv- 
ing) another. In schematic form, such reductions all roughly have the shape shown 
in Figure 4.1. 


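[Figure 4.1: under “problem P”, an instance X with a dashed line down to a solution Y; under
“problem Q”, an instance X̂ with a dashed line down to a solution Ŷ; a solid arrow from X to X̂
and a solid arrow from Ŷ back to Y.]


Figure 4.1. Generalized diagram of a reduction between problems. 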


Here, X is a P-instance, X̂ is a Q-instance, Ŷ is a Q-solution to X̂, and Y is a P-solution
to X. The dashed lines may be interpreted as going from an instance to a solution
(within each problem). The solid arrows denote maps or transformations “forth and
back” between the problems in such a way that we may (in a loose sense) regard the
diagram as commuting. (At this level of generality, such maps have also been called
generalized Galois–Tukey connections (Vojtáš [318]) or morphisms (Blass [16]).)
There are many specific reducibilities fitting this shape. The ones of interest to us here
will of course be those arising in computable mathematics and reverse mathematics. 
We will also consider several variations and extensions. 


4.1 Subproblems and identity reducibility 


Arguably, the simplest nontrivial way in which one problem can reduce to another 
is by being a subproblem. 


Definition 4.1.1 (Subproblem). A problem P is a subproblem of a problem Q if 
every P-instance X is also a Q-instance, and every Q-solution to any such X is also 
a P-solution to it. 


For example, consider the following: first, the problem whose instances are all 
odd-degree polynomials with integer coefficients in one variable, with the unique 
solution to any such polynomial being its least zero; and on the other hand, the 
problem whose instances are all such polynomials of degree 3, with the solutions 
being all its zeroes. The second problem is then a subproblem of the first. This fits 
well with our abstract conception of problem reducibility because, intuitively, being 
able to find the least zero of any odd-degree polynomial entails being able to find 
some (any) zero of a third-degree polynomial. 

Note that the second problem above has both fewer instances than the first, as 
well as more solutions to some of these. Among the problems we will encounter, 
this is actually not so common. Typically, a subproblem either restricts the domain 
(and so is a special case), or enlarges some or all of its values (and so is a less 
specific problem), but not both. For example, WKL is a subproblem of the version 
Disj-WKL discussed in Chapter 3. But WKL is also a subproblem of the problem 
whose instances are all infinite binary trees (so, the same as the instances of WKL), 
but with each such tree having as its unique solution the lexicographically least 
(“leftmost”) path. Thus, in the first case WKL is a subproblem only because it has 
fewer instances, and in the other, only because its instances have more solutions. 

The notion of subproblem is certainly natural, but somewhat limiting in that it 
cannot reveal any connections between truly different problems. A first step towards 
a more useful notion of reducibility is given by the following definition. 


Definition 4.1.2 (Identity reducibility). Let P and Q be problems. 


1. P is identity reducible to Q if every P-instance X computes a Q-instance X̂ such
that every Q-solution to X̂ is a P-solution to X. 

2. P is uniformly identity reducible to Q if there is a Turing functional ® such that 
®(X) is a Q-instance for every P-instance X, and every Q-solution to ®(X) is a 
P-solution to X. 


See Figure 4.2. Instead of asking for the instances of P to be included in those of Q, 
then, here we look at computably transforming the instances of P into instances of Q. 
The particular transformation may depend on the instance in the case of an identity
reduction, but not in the case of a uniform identity reduction. In particular, being a
subproblem is a uniform identity reduction, witnessed by the identity functional.


Figure 4.2. Diagrams of identity and uniform identity reductions. On the left, X is a P-instance
computing a Q-instance X̂ such that every Q-solution Ŷ to X̂ is also a P-solution to X. On the
right, the same situation, but X̂ = Φ(X) for a fixed Turing functional Φ. 


Proposition 4.1.3. The problem whose instances are all partial orders on w, with 
solutions to any such order being all its linear extensions, is uniformly identity 
reducible to WKL. 


Proof. Fix a partial order ≤_P on ω. (Recall that formally this is a subset of ω such
that the set of (x, y) with ⟨x, y⟩ ∈ ≤_P forms a partial order. We will write x ≤_P y
in place of ⟨x, y⟩ ∈ ≤_P for clarity.) Define T to be the set of all σ ∈ 2^{<ω} that are
consistent with being a linearization of ≤_P. That is, σ ∈ T if and only if it satisfies
the following conditions for all x, y, z ∈ ω.


1. If ⟨x, x⟩ < |σ| then σ(⟨x, x⟩) = 1.

2. If ⟨x, y⟩, ⟨y, x⟩ < |σ| and σ(⟨x, y⟩) = σ(⟨y, x⟩) = 1 then x = y.

3. If ⟨x, y⟩, ⟨y, z⟩, ⟨x, z⟩ < |σ| and σ(⟨x, y⟩) = σ(⟨y, z⟩) = 1 then σ(⟨x, z⟩) = 1.

4. If ⟨x, y⟩ < |σ| and x ≤_P y then σ(⟨x, y⟩) = 1.

5. If ⟨x, y⟩, ⟨y, x⟩ < |σ| then either σ(⟨x, y⟩) = 1 or σ(⟨y, x⟩) = 1.


Then T is uniformly computable from ≤_P, and it is infinite because every partial
order can be linearized, and every such linearization is a path through T. Thus, T is
an instance of WKL. Let L be any solution, i.e., any path through T. By construction,
L is a linearization of ≤_P. □
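
Conditions 1–5 only refer to pairs coded below |σ|, so membership of a string in T is a
finite check. The following sketch spells this out; it assumes the Cantor pairing
function as the coding ⟨x, y⟩ (the text's "usual coding" may differ), and the names
pair, unpair and in_tree, as well as the predicate leq standing for ≤_P, are
hypothetical.

```python
from math import isqrt


def pair(x, y):
    """Cantor pairing, used here as an assumed coding of <x, y>."""
    return (x + y) * (x + y + 1) // 2 + y


def unpair(n):
    """Inverse of pair."""
    w = (isqrt(8 * n + 1) - 1) // 2
    y = n - w * (w + 1) // 2
    return w - y, y


def in_tree(sigma, leq):
    """Check conditions 1-5 for the bit list sigma, where leq(x, y) decides x <=_P y."""
    n = len(sigma)
    pairs = [unpair(c) for c in range(n)]  # all (x, y) with <x, y> < |sigma|
    for (x, y) in pairs:
        v = sigma[pair(x, y)]
        if x == y and v != 1:
            return False  # condition 1
        if leq(x, y) and v != 1:
            return False  # condition 4
        if pair(y, x) < n:
            w = sigma[pair(y, x)]
            if v == 1 and w == 1 and x != y:
                return False  # condition 2
            if v != 1 and w != 1:
                return False  # condition 5
        for (y2, z) in pairs:
            if y2 == y and pair(x, z) < n:
                if v == 1 and sigma[pair(y, z)] == 1 and sigma[pair(x, z)] != 1:
                    return False  # condition 3
    return True
```

For the partial order in which 0 ≤_P 1 is the only nontrivial relation, initial segments
of the usual order ≤ on ω pass this test, while initial segments of the reverse order
fail condition 4 once ⟨0, 1⟩ is in range.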


It may seem peculiar that, in the above proof, we used the existence of lineariza- 
tions to conclude that T is infinite. This is not actually a problem, since our goal is 
not to prove that linearizations exist, but rather, to show that the set of linearizations 
has a particular shape (namely, that it is the set of solutions to a particular instance 
of WKL). All the same, it is worth noting that we could prove T to be infinite directly, 
without the need for any facts about partial orders. To see this, fix n ∈ ω. We exhibit
a σ ∈ T of length n. With our usual coding of pairs, this means we need to define σ
on ⟨x, y⟩ for all x, y < n. Fix ℓ < n, and suppose we have defined σ on ⟨x, y⟩ for all
x, y < ℓ. We then set σ(⟨ℓ, ℓ⟩) = 1, and for each x < ℓ, set σ(⟨ℓ, x⟩) = 1 if ℓ ≤_P x
and σ(⟨x, ℓ⟩) = 1 otherwise. It is easy to check that σ ∈ T. For another example 
along these lines, see the remarks following Theorem 11.3.1.


Proposition 4.1.4. The problem TJ is uniformly identity reducible to ÎPHP, i.e., the
parallelization of the infinitary pigeonhole principle.


Proof. Fix an instance of TJ, which is just an arbitrary set X ⊆ ω. For each n ∈ ω,
define c_n: ω → 2 by

\[ c_n(s) = \begin{cases} 0 & \text{if } \Phi_n^X(n)[s]\uparrow, \\ 1 & \text{if } \Phi_n^X(n)[s]\downarrow, \end{cases} \]

for all s ∈ ω. Then (c_n : n ∈ ω) is an instance of ÎPHP, uniformly computable
from X. Each c_n is either constantly 0 if Φ_n^X(n)↑, or eventually 1 if Φ_n^X(n)↓.
Hence, if (i_n : n ∈ ω) is a solution to (c_n : n ∈ ω) then it is unique and equal to the
characteristic function of X′. As sets and their characteristic functions are identified,
this completes the proof. □
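
The shape of the colorings c_n can be seen in a toy model. The sketch below is only an
illustration: make_colorings is a hypothetical name, and toy_halts_within is an
invented, decidable stand-in for the predicate "the n-th computation has converged
within s steps" (the genuine predicate depends on the oracle X and on the enumeration
of Turing functionals).

```python
def make_colorings(halts_within):
    """Return the map n -> c_n, where c_n(s) = 1 iff computation n halts within s steps."""
    def c(n):
        return lambda s: 1 if halts_within(n, s) else 0
    return c


def toy_halts_within(n, s):
    """Toy stand-in: computation n halts at step n*n when n is even, and never when n is odd."""
    return n % 2 == 0 and s >= n * n
```

Each c_n produced this way is nondecreasing in s, so exactly one color occurs
infinitely often: 1 if the n-th computation halts and 0 otherwise.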


To show that uniform identity reducibility is a strict refinement of identity re- 
ducibility in general, we turn again to WKL and Disj-WKL. As noted above, WKL 
is a subproblem of, and so is uniformly identity reducible to, Disj-WKL. Perhaps 
surprisingly, the identity reduction also holds in reverse. 


Proposition 4.1.5. Disj-WKL is identity reducible to WKL, but not uniformly so. 


Proof. We begin by showing that Disj-WKL is identity reducible to WKL. Fix an
instance of Disj-WKL. This is a binary tree T that may or may not be infinite. If T is
infinite, we can simply view it also as an instance U of WKL, and then any solution
to it as such is also a Disj-WKL-solution. If T is finite, let ℓ be such that T contains
no strings of length ℓ. Then, let U be a computable binary tree having as its only
path the characteristic function of {ℓ}. (Recall that for problem solutions, we identify
numbers with the singletons containing them.) Clearly, the only WKL-solution to U
is a Disj-WKL-solution to T. In both cases we have U ≤_T T, so we have described an
identity reduction.

The lack of uniformity above stems from us using different procedures depending
as T is finite or infinite. To make this explicit, suppose to the contrary that Disj-WKL
is uniformly identity reducible to WKL, say with witnessing Turing functional Φ. Let
T_1 = {1^n : n ∈ ω} be the tree consisting of all finite strings of 1s. Then by assumption,
Φ(T_1) is an infinite binary tree whose only path is 1^ω. Let ℓ be large enough so that every
string σ ∈ Φ(T_1) of height ℓ satisfies σ(0) = σ(1) = 1, and let s be large enough so
that Φ(T_1 ↾ s) converges on all strings of length ℓ. Take T = T_1 ↾ s. Then Φ(T) is also
an infinite binary tree, and every path f through Φ(T) satisfies f(0) = f(1) = 1.
But as T is finite, any such f must be (the characteristic function of) a singleton,
which is a contradiction. □


We include one further example, meant mostly to introduce a problem that will 
play an important role in our work in Chapter 8, and to relate it to one we have 
already seen. 


Definition 4.1.6 (Cohesive principle). COH is the problem whose instances are
sequences R = (R_i : i ∈ ω) of elements of 2^ω, with solutions to any such sequence
being all infinite sets S ⊆ ω such that for each e, either S ⊆* R_e or S ⊆* R̄_e (the
complement of R_e).


A set S as above is called cohesive for the family R, or R-cohesive for short. The 
terminology comes from the study of the c.e. sets in computability theory, where a 
set is called p-cohesive if it is cohesive for the family of all primitive recursive sets; 
r-cohesive if it is cohesive for the family of all computable sets; and simply cohesive 
if it is cohesive for the family of all c.e. sets. 


Proposition 4.1.7. Each of COH and SeqCompact_{2^ω} is uniformly identity reducible
to the other.


Proof. Given a family of sets (R_i : i ∈ ω), define a new family (R̂_e : e ∈ ω) by
R̂_e(i) = R_i(e) for all i, e ∈ ω. Now any SeqCompact_{2^ω}-solution to (R̂_e : e ∈ ω)
is a COH-solution to (R_i : i ∈ ω), and any COH-solution to (R̂_e : e ∈ ω) is a
SeqCompact_{2^ω}-solution to (R_i : i ∈ ω). □


As the proposition explicates, COH and SeqCompact_{2^ω} are really the same principle,
and each can be obtained from the other just by “turning the instances on their
sides”. SeqCompact_{2^ω} is more obviously a compactness result, whereas COH is
more convenient when working with combinatorial principles. 
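
The transformation in the proof of Proposition 4.1.7 is just a transposition. As a
small sketch, with families represented as nested Python functions (an assumption of
this example, as is the name transpose):

```python
def transpose(R):
    """Given R with R(i)(e) in {0, 1}, return R_hat with R_hat(e)(i) = R(i)(e)."""
    return lambda e: (lambda i: R(i)(e))


def binary_digits(i):
    """Toy family: R_i(e) is the e-th binary digit of i."""
    return lambda e: (i >> e) & 1
```

For this toy family, the set of powers of 2 is a SeqCompact_{2^ω}-solution: for each e,
the value R_i(e) is 0 for every power of 2 above 2^e. Equivalently, it is cohesive for
the transposed family.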


4.2 Computable reducibility 


We now come to one of the main reducibilities between problems, computable 
reducibility. This vastly generalizes identity reducibility, and is certainly one of the 
most direct ways to compare the computational complexity of different problems. It 
does have certain limitations, however, which we will discuss in Section 4.5. 


Definition 4.2.1 (Computable reducibility). Let P and Q be problems. 


1. P is computably reducible to Q, written P ≤_c Q, if every P-instance X computes
a Q-instance X̂ such that if Ŷ is any Q-solution to X̂ then X ⊕ Ŷ computes a
P-solution to X.

2. P is computably equivalent to Q, written P ≡_c Q, if P ≤_c Q and Q ≤_c P. 


We represent this visually in Figure 4.3. Basic facts are collected in Proposition 4.2.2. 


Proposition 4.2.2. ≤_c is a transitive relation, and ≡_c is an equivalence relation.
Moreover, if P and Q are problems and P is identity reducible to Q then
P ≤_c Q. 


Intuitively, P ≤_c Q means that the problem P can be (effectively) coded into the 
problem Q, and this is how the situation is often described. We have seen some 
examples of this already in Section 3.6, with problems that code the jump or code 
PA. Computable reducibility allows us to recast some of these complexity notions, 
and better understand some of the results employing them. 


Proposition 4.2.3. A problem P admits computable solutions if and only if P ≤_c Id_{2^ω}.


Figure 4.3. Diagram of a computable reduction: X is a P-instance computing a Q-instance X̂; Ŷ
is a Q-solution to X̂ so that X ⊕ Ŷ computes a P-solution Y to X.


Proposition 4.2.4. A problem P codes the jump if and only if TJ ≤_c P. 


Proposition 4.2.5. Let P be a problem. 


1. P admits solutions in PA if and only if P ≤_c WKL.
2. P codes PA if and only if WKL ≤_c P.


Proof. Part (1) is basically the definition. We prove (2). By definition, and using the
existence of a subtree of 2^{<ω} all of whose paths have PA degree, WKL is computably
reducible to P if and only if for every A ⊆ ω, there is an A-computable instance of
P whose every solution Y satisfies A ⊕ Y ≫ A. □


Recall the problems Disj-HBT_[0,1] and COH given by Definitions 3.3.4 and 4.1.6,
respectively. Combining Proposition 4.2.5 with earlier results yields the following.


Corollary 4.2.6. Disj-HBT_[0,1] ≡_c WKL.

Proof. By Propositions 3.8.1 and 3.8.3. □

Corollary 4.2.7. COH ≰_c WKL and WKL ≰_c COH.

Proof. By Propositions 3.8.5, 3.8.6 and 4.1.7. □

Corollary 4.2.8. KL ≰_c WKL.

Proof. By Proposition 3.6.10. □


And while we are discussing KL and Proposition 3.6.10, let us note that using 
computable reducibility, the latter can actually be improved to an equivalence. Thus, 
as far as their computability theoretic complexity is concerned, “solving König’s lemma” and 
“solving the halting problem” are the same. 


Proposition 4.2.9. $\mathsf{KL} \equiv_c \mathsf{TJ}$. 


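Proof. By Proposition 3.6.10, we only need to show $\mathsf{KL} \le_c \mathsf{TJ}$. Let $T \subseteq \omega^{<\omega}$ be an 
infinite, finitely branching tree. Define $p \colon \omega^{<\omega} \to \omega$ as follows: 

\[
p(\alpha) =
\begin{cases}
0 & \text{if } \alpha \notin \operatorname{Ext}(T), \\
(\mu i)[\alpha^\frown i \in \operatorname{Ext}(T)] & \text{if } \alpha \in \operatorname{Ext}(T).
\end{cases}
\]

By König’s lemma, p is total, and by relativizing Proposition 2.8.5, $p \le_T T'$. Now 
define $\alpha_0 = \langle\rangle$, and for $i \in \omega$, $\alpha_{i+1} = \alpha_i{}^\frown p(\alpha_i)$. Then $\alpha_0 \prec \alpha_1 \prec \cdots \in T$, and 
$f = \bigcup_{i \in \omega} \alpha_i$ is an infinite path through T. We have $f \le_T \langle \alpha_i : i \in \omega \rangle \le_T p \le_T T'$. □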
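
The path construction in the proof above can be sketched in Python. This is only an illustration under stated assumptions: the function name `build_path` and the predicate `is_extendible` are hypothetical, and `is_extendible` stands in for the $T'$-computable test of membership in $\operatorname{Ext}(T)$, which is not computable in general. The toy oracle below is chosen so that the test happens to be decidable.

```python
from itertools import count

def build_path(is_extendible, length):
    """Return alpha_length, where alpha_0 is empty and alpha_{i+1} = alpha_i ^ p(alpha_i)."""
    alpha = ()  # alpha_0 is the empty string
    for _ in range(length):
        # p(alpha): the least i such that alpha ^ i is extendible. The search halts
        # whenever alpha is extendible and the tree is finitely branching.
        i = next(i for i in count() if is_extendible(alpha + (i,)))
        alpha = alpha + (i,)
    return alpha

def toy_is_extendible(sigma):
    # Hypothetical oracle: a string is extendible iff each entry is 1 or 2.
    return all(v in (1, 2) for v in sigma)

path = build_path(toy_is_extendible, 5)
print(path)
```

Every step follows the least extendible child, so the result is an initial segment of the leftmost infinite path.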


At this point, we may believe that any two problems that admit computable 
solutions are computably equivalent. But consider, on the one hand, the problem 
whose instances are all sets, and on the other, the problem whose instances are 
all sets computing $\emptyset'$, with each instance of either problem having itself as its 
unique solution. Both problems admit computable solutions, but the former is not 
computably reducible to the latter since it has instances (e.g., the computable sets) 
that compute no instance of the latter. Once again this is a bit of an artificial situation, 
however. Natural problems do have A-computable instances for all $A \subseteq \omega$, and for 
these the above intuition is correct. In particular, we have the following. 


Proposition 4.2.10. The following are all computably equivalent. 


1. $\mathsf{RT}^1$. 
2. IPHP. 
3. General-IPHP. 
4. FUF. 


We will look at these principles again with a finer lens in the next section. Specifically, 
this result should be compared with Proposition 4.3.4. 

We next prove that the converse of Proposition 4.1.4 fails even under computable 
reducibility. Thus, from a computational standpoint, $\widehat{\mathsf{IPHP}}$ is a more complex problem 
than that of finding the Turing jump of a set. 


Proposition 4.2.11. $\widehat{\mathsf{IPHP}} \nleq_c \mathsf{TJ}$. 


Proof. We build a computable instance $\langle c_e : e \in \omega \rangle$ of $\widehat{\mathsf{IPHP}}$ with no $\emptyset'$-computable 
solution. More precisely, each $c_e$ will be a map $\omega \to 2$, and we will ensure that for 
each $e \in \omega$, if $\Phi_e^{\emptyset'}(e)\downarrow = i$ for some $i < 2$ then $c_e(s) = 1 - i$ for almost all s. An 
$\widehat{\mathsf{IPHP}}$-solution to $\langle c_e : e \in \omega \rangle$ is an $X \in 2^\omega$ with $X(e) = i$ only if $c_e(s) = i$ for 
infinitely many s. So this construction ensures that $\Phi_e^{\emptyset'}$ fails to be a solution to our 
instance for every e. 

Each $c_e$ is defined uniformly computably in e, which ensures that $\langle c_e : e \in \omega \rangle$ is 
computable. To define $c_e(s)$ for some s, run the computation $\Phi_e^{\emptyset'}(e)[s]$, and if this 
converges to some $i < 2$ let $c_e(s) = 1 - i$. (Note that $\Phi_e^{\emptyset'}(e)[s]$ is run for at most s 
many steps, so checking this convergence is computable.) Now if $\Phi_e^{\emptyset'}(e)\downarrow = i$ then 
there is an $s_0$ so that $\Phi_e^{\emptyset'}(e)[s] = i$ for all $s \ge s_0$, and therefore $c_e(s) = 1 - i$ for all 
such s, as desired. □

Figure 4.4. Diagram of a Weihrauch reduction: X is a P-instance computing a Q-instance $\widehat{X}$, and 
$\widehat{Y}$ is a Q-solution to $\widehat{X}$ so that $X \oplus \widehat{Y}$ computes a P-solution Y to X. 


In the above proof, it is worth noting that if $\Phi_e^{\emptyset'}(e)\uparrow$ then $\Phi_e^{\emptyset'}(e)[s]$ may converge 
to 0 and 1 for infinitely many s each, so $c_e$ will not be eventually constant. And 
indeed, we cannot modify the proof so that $c_e$ is eventually constant for every e (see 
Exercise 4.8.3). 
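
The definition of $c_e(s)$ in the proof of Proposition 4.2.11 can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: `approx(e, s)` is a hypothetical stand-in for the stage-s computation $\Phi_e^{\emptyset'}(e)[s]$, returning `None` when it has not converged, and the value 0 assigned in the non-convergent case is a choice of this sketch, since the proof leaves that case unspecified.

```python
def c(approx, e, s):
    """c_e(s): flip the stage-s output of computation e when it is a bit."""
    value = approx(e, s)
    if value in (0, 1):
        return 1 - value
    return 0  # assumption of this sketch: default color when there is no convergence

def toy_approx(e, s):
    # Hypothetical approximation: computation e settles on e % 2 from stage e + 3 on.
    return e % 2 if s >= e + 3 else None

row0 = [c(toy_approx, 0, s) for s in range(6)]
row1 = [c(toy_approx, 1, s) for s in range(6)]
print(row0, row1)
```

Once the approximation settles on a bit i, the coloring is constantly 1 - i, so i cannot be a correct answer at coordinate e.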


4.3 Weihrauch reducibility 


The second main notion of problem reducibility we consider in this book is to 
computable reducibility as uniform identity reducibility is to identity reducibility. That 
is, it is its uniform analogue. The origins of this reducibility are in computable 
analysis, going back to the work of Weihrauch [323]. Later, it was noticed by 
Gherardi and Marcone [122], and independently by Dorais, Dzhafarov, Hirst, Mileti, 
and Shafer [72], that this notion can also be a fruitful way of comparing problems 
that arise in the context of reverse mathematics. 


Definition 4.3.1 (Weihrauch reducibility). Let P and Q be problems. 


1. P is Weihrauch reducible to Q, written $P \le_W Q$, if there exist Turing functionals 
$\Phi$ and $\Psi$ satisfying the following: if X is any P-instance then $\Phi(X)$ is a 
Q-instance, and if $\widehat{Y}$ is any Q-solution to $\Phi(X)$ then $\Psi(X \oplus \widehat{Y})$ is a P-solution 
to X. 

2. P is Weihrauch equivalent to Q, written $P \equiv_W Q$, if $P \le_W Q$ and $Q \le_W P$. 


See Figure 4.4. 


Proposition 4.3.2. $\le_W$ is a transitive relation, and $\equiv_W$ is an equivalence relation. 
Moreover, let P and Q be problems. 


1. If P is uniformly identity reducible to Q then $P \le_W Q$. 
2. If $P \le_W Q$ then $P \le_c Q$. 


Proposition 4.3.3. A problem P uniformly admits computable solutions if and only 
if $P \le_W \mathsf{Id}_{2^\omega}$. 


If $P \le_W Q$ then the instances of P should not split apart in some noncomputable 
way that affects how they compute instances of Q. Similarly, the solutions of Q should 
not split apart in such a fashion either. A Weihrauch reduction from one problem to 
another thus conveys a much closer relationship than a computable reduction alone. 
As a result, it is able to tease out subtler distinctions than computable reducibility. 

We begin with the following analogue of Proposition 4.2.10. Due to the increased 
uniformity, this is no longer a completely obvious result. 


Proposition 4.3.4. The following are all Weihrauch equivalent. 


1. $\mathsf{RT}^1$. 
2. IPHP. 
3. General-IPHP. 


Proof. We show $\mathsf{RT}^1 \le_W \mathsf{IPHP} \le_W \text{General-IPHP} \le_W \mathsf{RT}^1$. 

First, fix an instance of $\mathsf{RT}^1$, which is a map $f \colon \omega \to \omega$ with bounded range. Then 
f is also an instance of IPHP. If $i \in \omega$ is any IPHP-solution to f then $\{x : f(x) = i\}$ 
is uniformly computable from f and i, and is an $\mathsf{RT}^1$-solution to f. 

That $\mathsf{IPHP} \le_W \text{General-IPHP}$ is immediate since IPHP is a subproblem of 
General-IPHP. 

So now fix an instance of General-IPHP. This is a sequence $\langle X_i : i \in \omega \rangle$ of sets 
with infinite union and $X_i = \emptyset$ for almost all i. Define $f \colon \omega \to \omega$ as follows: given 
$x \in \omega$, search for the least $i \in \omega$ such that $X_i$ contains some $y > x$, and let $f(x) = i$. 
Clearly f is uniformly computable from $\langle X_i : i \in \omega \rangle$, and since only finitely many 
of the $X_i$ are nonempty, f has bounded range. Thus, f is an $\mathsf{RT}^1$-instance, and if H 
is any solution to it then $f(\min H)$ is a General-IPHP-solution to $\langle X_i : i \in \omega \rangle$. □
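
The last step of the proof can be sketched in Python. The sketch rests on an assumption about how the search is organized: it dovetails over pairs (i, y) and returns the index i of the first witness it meets, which is one concrete way to make the search effective. The names `f`, `member`, and `toy_member` are hypothetical, with `member(i, y)` standing for the question of whether y belongs to $X_i$.

```python
from itertools import count

def f(member, x):
    """Color of x: the index i of the first witness pair (i, y) found with y > x, y in X_i."""
    for bound in count():
        for i in range(bound + 1):
            for y in range(x + 1, x + 2 + bound):
                if member(i, y):
                    return i

def toy_member(i, y):
    # Hypothetical instance: X_1 = even numbers, X_2 = {5}, every other X_i is empty.
    return (i == 1 and y % 2 == 0) or (i == 2 and y == 5)

colors = [f(toy_member, x) for x in range(20)]
print(colors)
```

The search halts on every x because the union of the sets is infinite. In the toy instance every x receives color 1, and $X_1$ is indeed infinite.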


Notably absent above is FUF. Indeed, whereas FUF was equivalent to the three above 
under $\le_c$, it is strictly weaker under $\le_W$. 


Proposition 4.3.5. $\mathsf{FUF} \le_W \mathsf{RT}^1$ but even $\mathsf{RT}^1_2 \nleq_W \mathsf{FUF}$. 


Proof. Seeking a contradiction, suppose $\Phi$ and $\Psi$ witness that $\mathsf{RT}^1_2 \le_W \mathsf{FUF}$ as in 
Definition 4.3.1. We construct an instance $f \colon \omega \to 2$. By assumption, $\Phi(f)$ is an 
instance of FUF, meaning a family $\langle F_i : i \in \omega \rangle$ of finite sets with $F_i = \emptyset$ for almost 
all i. As $f \in 2^\omega$, initial segments of it are elements of $2^{<\omega}$. For $\sigma \in 2^{<\omega}$, say $\sigma$ puts 
$x \in F_i$ if $\Phi(\sigma)(\langle i, x \rangle)\downarrow = 1$. In this case, if $\sigma \prec f$ we will indeed have $x \in F_i$. 

There must exist a $\sigma \in 2^{<\omega}$ and an $n \in \omega$ such that no $\tau \succeq \sigma$ puts any x in 
any $F_i$ with $i > n$, and no $x > n$ in any $F_i$ with $i \le n$. If not, we could define a 
sequence $\tau_0 \prec \tau_1 \prec \cdots$ such that for each n, $\tau_n$ puts some x in some $F_i$ with $i > n$ 
or some $x > n$ in some $F_i$ with $i \le n$. But then, if we let $f = \bigcup_{i \in \omega} \tau_i$, the family 
$\Phi(f) = \langle F_i : i \in \omega \rangle$ will either satisfy $F_i \ne \emptyset$ for infinitely many i, or else some $F_i$ 
will be infinite, which is impossible by our assumption on $\Phi$. 

It follows that for every $f \succ \sigma$, the FUF-instance $\Phi(f)$ has n as a solution. By 
assumption on $\Psi$, then, $\Psi(f \oplus \{n\})$ is an $\mathsf{RT}^1_2$-solution to any such f. We can thus 
find $\tau \succeq \sigma$ so that $\Psi(\tau \oplus \{n\})\downarrow = \{i\}$ for some $i < 2$. But now let $f = \tau^\frown (1-i)^\omega$. 
That is, f is the extension of $\tau$ with $f(x) = 1 - i$ for all $x \ge |\tau|$. Clearly, i is not an 
$\mathsf{RT}^1_2$-solution to this f even though $\Psi(f \oplus \{n\}) = i$, which is a contradiction. □


Weihrauch reducibility can elucidate further subtle differences between different 
formulations of the same problem. Recall from Section 3.2 the version of IPHP whose 
instances, rather than being maps $\omega \to \omega$ with bounded range, are pairs $\langle k, f \rangle$, 
where $k \in \omega$ and $f \colon \omega \to k$. Let us denote this variant by $\mathsf{IPHP}_{+}$. It is easy to see 
that $\mathsf{IPHP}_{+} \equiv_c \mathsf{IPHP}$. Also, $\mathsf{IPHP}_{+}$ is trivially Weihrauch (in fact, uniformly identity) 
reducible to IPHP. 


Proposition 4.3.6. $\mathsf{IPHP} \nleq_W \mathsf{IPHP}_{+}$. 


Proof. Suppose $\Phi$ and $\Psi$ are Turing functionals witnessing that $\mathsf{IPHP} \le_W \mathsf{IPHP}_{+}$. 
We construct an instance $g \colon \omega \to \omega$ of IPHP. Initial segments of g are thus strings 
$\alpha \in \omega^{<\omega}$. By assumption on $\Phi$, we can fix some such $\alpha$ such that $\Phi(\alpha)(2k)\downarrow = 1$ 
for some k. (Note that the instances of $\mathsf{IPHP}_{+}$ are formally pairs $\langle k, f \rangle = \{k\} \oplus f$.) 
Thus, for any IPHP-instance $g \succ \alpha$, the $\mathsf{IPHP}_{+}$-instance $\Phi(g)$ will have some $i < k$ 
as a solution. By assumption on $\Psi$, we may thus fix $\alpha' \succeq \alpha$ such that for each $i < k$, 
either $\Psi(\alpha' \oplus \{i\})\downarrow$ or $\Psi(\beta \oplus \{i\})\uparrow$ for all $\beta \succeq \alpha'$. Let b be the largest value of 
$\Psi(\alpha' \oplus \{i\})$ for $i < k$. Then, define $g = \alpha'^\frown (b+1)^\omega$. So $g(x) = b + 1$ for almost all 
x, hence g has bounded range and is thus an instance of IPHP. Since g extends $\alpha'$ we 
have $\Phi(g) = \langle k, f \rangle$ for some f and $\Psi(g \oplus \{i\}) \le b$ for all $i < k$. Because there are 
not infinitely many x with $g(x) \le b$, this means $\Psi(g \oplus \{i\})$ is not an IPHP-solution 
to g for any $i < k$, a contradiction. □


A related example concerns Ramsey’s theorem, $\mathsf{RT}^n_k$, for different values of k. 
Consider first the case n = 1. Clearly, if $j < k$ then $\mathsf{RT}^1_j$ is a subproblem of $\mathsf{RT}^1_k$. Since 
both principles also admit computable solutions, we also have that $\mathsf{RT}^1_k \le_c \mathsf{RT}^1_j$, so 
the two are computably equivalent. As it turns out, this cannot be improved to a 
Weihrauch reduction. 


Proposition 4.3.7 (Dorais, Dzhafarov, Hirst, Mileti, and Shafer [72]; Hirschfeldt 
and Jockusch [148]; Brattka and Rakotoniaina [20]). For all $k > j$, $\mathsf{RT}^1_k \nleq_W \mathsf{RT}^1_j$. 


The proof is Exercise 4.8.4. In Section 4.5, we will define a notion of “applying 
a problem multiple times” and, among other results, establish that $\mathsf{RT}^1_k$ can be 
uniformly reduced to multiple applications of $\mathsf{RT}^1_j$. That argument helps elucidate 
the key obstacle to uniformly reducing $\mathsf{RT}^1_k$ to (a single application of) $\mathsf{RT}^1_j$ as above. 
In Theorem 9.1.11, we will also see a higher dimensional analogue of this result: 
that if $n \ge 2$ and $k > j$, then $\mathsf{RT}^n_k \nleq_c \mathsf{RT}^n_j$. 

Interestingly, Weihrauch reducibility also captures many of the nuances of con- 
structive/intuitionistic mathematics. For example, part of the relationship between 
WKL and the Brouwer fan theorem, FAN, is reflected in the following result. 


Proposition 4.3.8. $\mathsf{FAN} \le_W \mathsf{WKL}$ but $\mathsf{WKL} \nleq_c \mathsf{FAN}$. 


Proof. Fix an instance of FAN, i.e., a bar $B \subseteq 2^{<\omega}$. Let T be the set of all $\sigma \in 2^{<\omega}$ as 
follows: for each $n < |\sigma|$, if every $\tau \in 2^n$ has an initial segment in B then $\sigma(n) = 1$, 
and otherwise $\sigma(n) = 0$. Then T is an infinite tree, uniformly computable from B. 
Clearly, T has a single path, $f \in 2^\omega$, that is either equal to $0^\omega$ or to $0^n 1^\omega$ for some n. 
We claim the former cannot happen. For otherwise, the set of all $\sigma \in 2^{<\omega}$ such that 
$\sigma$ has no initial segment in B would be an infinite binary tree, and any path through 
it would thus contradict the fact that B is a bar. So let n be such that $f = 0^n 1^\omega$. Then 
the set $B'$ of all initial segments in B of the strings $\sigma \in 2^n$ is a finite bar. Note that 
n can be found uniformly computably from f, and so $B'$ is uniformly computable 
from $B \oplus f$. 

To show that $\mathsf{WKL} \nleq_c \mathsf{FAN}$ we just note that WKL has a computable instance with 
no computable solutions, whereas FAN admits computable solutions. □


We include one more example related to WKL, which further underscores the 
difference between bounded and unbounded strings and trees. As with $\mathsf{RCA}_0$, the 
“R” in DNR is due to tradition, apart from which the system could be called DNC. 


Definition 4.3.9 (Diagonally noncomputable problem). 


1. DNR is the problem whose instances are all sets $X \in 2^\omega$, with the solutions to 
any such X being all functions $f \colon \omega \to \omega$ that are DNC relative to X. 

2. For each $k \ge 2$, $\mathsf{DNR}_k$ is the problem whose instances are all sets $X \in 2^\omega$, with 
the solutions to any such X being all k-valued functions $f \colon \omega \to \omega$ that are 
DNC relative to X. 


The following basic facts are left to the reader. 


Proposition 4.3.10. 


1. $\mathsf{WKL} \equiv_W \mathsf{DNR}_2$. 
2. For all $j, k \ge 2$, $\mathsf{DNR}_j \equiv_c \mathsf{DNR}_k$. 


One half of part (2) follows simply from the fact that if $j < k$ then $\mathsf{DNR}_k$ is a 
subproblem of $\mathsf{DNR}_j$ (and so $\mathsf{DNR}_k \le_W \mathsf{DNR}_j$). Surprisingly, the other half of the 
equivalence cannot be similarly strengthened. 


Theorem 4.3.11 (Jockusch [170]). For all $k > j$, $\mathsf{DNR}_j \nleq_W \mathsf{DNR}_k$. 


Proof. Suppose otherwise, as witnessed by functionals $\Phi$ and $\Psi$. Given $X \le_T Y$, any 
function which is DNC relative to Y is also DNC relative to X. Thus, we may assume 
that $\Phi$ is the identity functional. Taking $\emptyset$ as a $\mathsf{DNR}_j$-instance, it thus suffices to 
exhibit a k-valued DNC function f such that $\Psi(f)$ is not a j-valued DNC function. 
Let T be the set of all $\sigma \in k^{<\omega}$ such that for all $e < |\sigma|$, if $\Phi_e(e)[|\sigma|]\downarrow$ then 
$\sigma(e) \ne \Phi_e(e)$. Then T is a computable tree, and the paths through T are precisely 
the k-valued DNC functions. 

Fix $x \in \omega$. Since T is bounded, it follows by assumption on $\Psi$ that there is an $\ell \in \omega$ 
such that $\Psi(\sigma)(x)\downarrow < j$ for all $\sigma \in T$ with $|\sigma| = \ell$. Let $\ell_x$ be the least such $\ell$. For 
every $y < j$, let $T_{x,y} = \{\sigma \in T : |\sigma| = \ell_x \wedge \Psi(\sigma)(x)\downarrow = y\}$, so that 

\[
\{\sigma \in T : |\sigma| = \ell_x\} = \bigcup_{y < j} T_{x,y}. \tag{4.1}
\]

Let $D = \{e < \ell_x : \Phi_e(e)[\ell_x]\uparrow\}$, and call $T_{x,y}$ large if for every $\tau \in k^{<\omega}$ with 
$|\tau| = \ell_x$ there is a $\sigma \in T_{x,y}$ such that $\sigma(e) \ne \tau(e)$ for all $e \in D$. 

We claim that there is a $y < j$ such that $T_{x,y}$ is large. Suppose not. Then for each y 
we can fix $\tau_y \in k^{<\omega}$ such that $|\tau_y| = \ell_x$ and every $\sigma \in T_{x,y}$ satisfies $\sigma(e) = \tau_y(e)$ for 
some $e \in D$. Now, we define a string $\sigma \in k^{<\omega}$ of length $\ell_x$ as follows. Fix $e < \ell_x$. If 
$e \notin D$ then $\Phi_e(e)[\ell_x]\downarrow$, and we let $\sigma(e)$ have any value different from this computation. 
If $e \in D$, then we let $\sigma(e)$ be any $i < k$ such that $\tau_y(e) \ne i$ for all $y < j$, which 
exists because $j < k$. But then $\sigma \in T$ and $\sigma \notin T_{x,y}$ for any $y < j$, which contradicts 
(4.1). So the claim holds. 

Next, we claim that if $T_{x,y}$ is large then there exists $\sigma \in T_{x,y}$ with $\sigma(e) \ne \Phi_e(e)$ 
for all $e < \ell_x$. Indeed, let $\tau(e)$ for each $e < \ell_x$ be $\Phi_e(e)$ if the latter converges 
and 0 otherwise. By definition of T, every $\sigma \in T_{x,y}$ satisfies that $\sigma(e) \ne \tau(e)$ for 
all $e \notin D$. If $T_{x,y}$ is large, then there is a $\sigma \in T_{x,y}$ that additionally satisfies that 
$\sigma(e) \ne \tau(e)$ for all $e \in D$, which proves the claim. 

To complete the proof, note that the level $\ell_x$ is uniformly computable from x, 
and that determining whether or not $T_{x,y}$ is large is uniformly computable from x 
and $y < j$. Thus, by the first claim above, there is a computable function g such 
that for every x, $g(x) = y$ for the least $y < j$ such that $T_{x,y}$ is large. Let h be the 
computable function defined by setting $\Phi_{h(x)}(z) = g(x)$ for all z, and apply the 
recursion theorem (Theorem 2.4.11) to get an e such that $\Phi_{h(e)} = \Phi_e$. By the second 
claim, there is a k-valued DNC function f having an initial segment in $T_{e,g(e)}$. But 
now this function satisfies 

\[
\Psi(f)(e)\downarrow = g(e) = \Phi_{h(e)}(e) = \Phi_e(e),
\]

so $\Psi(f)$ is not DNC. □


4.4 Strong forms 


An important but somewhat subtle detail in both computable reducibility and 
Weihrauch reducibility concerns access to the original instances. Say P is being 
reduced to Q, and we are given a P-instance X. Then there is a Q-instance $\widehat{X}$ 
computable from X (uniformly, in the case of Weihrauch reducibility) every solution to 
which, together with X, computes a P-solution to X. The access to X as an oracle is 
used in many such reductions as part of a kind of “post-processing”. 

For example, consider the reduction $\mathsf{IPHP} \le_W \mathsf{RT}^1$. An IPHP-instance c is viewed 
as an $\mathsf{RT}^1$-instance, and from an $\mathsf{RT}^1$-solution H we uniformly compute an 
IPHP-solution by applying c to an element of H. That is, c tells us what color it takes 
on H, and this is the IPHP-solution. But without access to c, there is no apparent 
way to determine this information, and the reduction breaks down. Does a different 
reduction exist that avoids needing to consult c at the end? We will see below that 
the answer is no. 
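
The role of the original instance in this post-processing step can be made concrete with a small Python sketch. The names `backward` and `toy_c` are hypothetical, and a finite list stands in for an initial segment of an infinite homogeneous set H.

```python
def backward(c, H):
    """Post-processing step: read off the color that c takes on the homogeneous set H."""
    return c(min(H))

def toy_c(x):
    # Hypothetical IPHP-instance with range {0, 1, 2}: multiples of 3 get color 2.
    return 2 if x % 3 == 0 else x % 2

solution = backward(toy_c, [3, 6, 9, 12])
print(solution)
```

The function `backward` cannot be written with H as its only argument unless the color is somehow recoverable from H itself, which is exactly the information the strong forms below withhold.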

Denying access to the original instance is thus an additional resource constraint, 
and determining when such access is essential helps further elucidate how much 
information is necessary to reduce one problem to another. We can measure when 
access to the original P-instance X is indeed essential using the following strong 
forms of computable and Weihrauch reducibilities. 

Figure 4.5. Diagrams of strong computable and strong Weihrauch reductions. 


Definition 4.4.1 (Strong reducibility forms). Let P and Q be problems. 


1. P is strongly computably reducible to Q, written $P \le_{sc} Q$, if every P-instance X 
computes a Q-instance $\widehat{X}$ such that every Q-solution $\widehat{Y}$ to $\widehat{X}$ computes a P-solution 
Y to X. 

2. P is strongly computably equivalent to Q, written $P \equiv_{sc} Q$, if $P \le_{sc} Q$ and 
$Q \le_{sc} P$. 

3. P is strongly Weihrauch reducible to Q, written $P \le_{sW} Q$, if there exist Turing 
functionals $\Phi$ and $\Psi$ satisfying the following: if X is any P-instance then $\Phi(X)$ 
is a Q-instance, and if $\widehat{Y}$ is any Q-solution to $\Phi(X)$ then $\Psi(\widehat{Y})$ is a P-solution 
to X. 

4. P is strongly Weihrauch equivalent to Q, written $P \equiv_{sW} Q$, if both $P \le_{sW} Q$ and 
$Q \le_{sW} P$. 


The next proposition establishes the key properties of $\le_{sc}$ and $\le_{sW}$. 


Proposition 4.4.2. $\le_{sc}$ and $\le_{sW}$ are transitive relations, and $\equiv_{sc}$ and $\equiv_{sW}$ are 
equivalence relations. Moreover, let P and Q be problems. 


1. If P is identity reducible to Q then $P \le_{sc} Q$. 
2. If P is uniformly identity reducible to Q then $P \le_{sW} Q$. 
3. If $P \le_{sW} Q$ then $P \le_{sc} Q$. 
4. If $P \le_{sc} Q$ then $P \le_c Q$. 
5. If $P \le_{sW} Q$ then $P \le_W Q$. 


Identity and uniform identity reducibilities are of course the extreme examples of 
strong reductions, since they involve no “post-processing” at all. 

We leave to Exercise 4.8.6 the following strong analogue of a part of Propo- 
sition 4.3.4. However, as the subsequent result shows, we cannot obtain a strong 
analogue of the full proposition. 


Proposition 4.4.3. $\text{General-IPHP} \equiv_{sW} \mathsf{IPHP}$. 


Proposition 4.4.4. $\mathsf{RT}^1_2 \nleq_{sW} \mathsf{IPHP}$ and $\mathsf{IPHP} \nleq_{sW} \mathsf{RT}^1$. 


Proof. Assume $\mathsf{RT}^1_2 \le_{sW} \mathsf{IPHP}$ via $\Phi$ and $\Psi$. Since binary strings are initial segments 
of $\mathsf{RT}^1_2$-instances, $\Phi$ must map these to initial segments of IPHP-instances. Since 
IPHP-instances have bounded range, we can fix a $\sigma \in 2^{<\omega}$ and a $k \in \omega$ so that there 
is no x or $\tau \succeq \sigma$ for which $\Phi(\tau)(x)\downarrow \ge k$. Then for any $\mathsf{RT}^1_2$-instance c extending $\sigma$, 
the IPHP-instance $\Phi(c)$ will have some $i < k$ as a solution, and hence $\Psi(\{i\})$ will be 
an infinite set. Define c as follows. Let $c(x) = \sigma(x)$ for all $x < |\sigma|$. For each $i < k$ 
such that the set $\Psi(\{i\})$ is large enough, choose the least $x \ge |\sigma|$ in this set and set 
$c(x) = 0$. Then, let $c(x) = 1$ for all other x. Thus, all $\mathsf{RT}^1_2$-solutions to c consist of 
elements colored 1 by c. But if $i < k$ is any solution to $\Phi(c)$, then $\Psi(\{i\})$ contains 
an element colored 0 by c, hence $\Psi(\{i\})$ is not a solution after all. 

Next, assume $\mathsf{IPHP} \le_{sW} \mathsf{RT}^1$ via $\Phi$ and $\Psi$. We define an instance g of IPHP as in 
the proof of Proposition 4.3.6. First, we can fix $\alpha \in \omega^{<\omega}$ and $k \in \omega$ such that for all 
$\beta \succeq \alpha$ and all x, if $\Phi(\beta)(x)\downarrow = i$ then $i < k$. Thus, for any $g \succeq \alpha$, $\Phi(g)$ will be an 
instance c of $\mathsf{RT}^1_k$, and any infinite homogeneous set for c with color $i < k$ will be 
mapped by $\Psi$ to an IPHP-solution to g. We can now fix $\alpha' \succeq \alpha$ such that one of the 
following alternatives holds for each $i < k$. 


1. There exists a finite set F so that $\Phi(\alpha')(x)\downarrow = i$ for each $x \in F$ and such that 
$\Psi(F)(j)\downarrow = 1$ for some j. 
2. There exists a $b \in \omega$ so that no $x > b$ satisfies $\Phi(\beta)(x)\downarrow = i$ for any $\beta \succeq \alpha'$. 


Let m be the maximum of all the j obtained under Case 1. Define $g \colon \omega \to m + 2$ 
by $g(x) = \alpha'(x)$ for all $x < |\alpha'|$, and $g(x) = m + 1$ for all other x. Thus, the only 
IPHP-solution to g is $m + 1$. But if H is any $\mathsf{RT}^1$-solution to $\Phi(g)$, then $\Psi(H) = j$ 
for some $j < m + 1$, a contradiction. □


The preceding result illustrates an interesting subtlety concerning computable 
instances. If X is computable and Z is any set, then anything computable from 
$X \oplus Z$ is also computable from Z alone. We might thus expect that if we restrict to 
computable instances, the strong versions of our reducibilities behave the same way as 
the originals. And indeed, this is the case for computable reducibility, in the following 
sense: if P and Q are problems and a computable P-instance X witnesses that $P \nleq_{sc} Q$, 
then the same instance also witnesses that $P \nleq_c Q$. But this fails for Weihrauch 
reducibility: the $\mathsf{RT}^1_2$-instance c we built above to witness that $\mathsf{RT}^1_2 \nleq_{sW} \mathsf{IPHP}$ was 
computable, and yet we still have $\mathsf{RT}^1_2 \le_W \mathsf{IPHP}$. 

A closer look at the proof reveals the reason. By first finding the initial segment $\sigma$ 
of c and determining the bound k, we were able to ask $\Psi$ to produce elements on which 
our coloring was not yet defined. We could then change the colors appropriately to 
diagonalize. Had $\Psi$ had access to c, it could have easily avoided this situation, 
for example by never outputting any number before c colored it. In the proof, c 
is computable, but not uniformly computable. The initial segment $\sigma$ requires a $\emptyset'$ 
oracle to find. As it turns out, this is in general the only obstacle to moving from $\nleq_{sW}$ 
to $\nleq_W$. 


Proposition 4.4.5. Let P and Q be problems. Suppose that for each pair of Turing 
functionals $\Phi$ and $\Psi$ there is a P-instance $X_{\Phi,\Psi}$ that is uniformly computable in 
(indices for) $\Phi$ and $\Psi$ and that witnesses that P is not strongly Weihrauch reducible 
to Q via these functionals. Then $P \nleq_W Q$. 


Proof. Let f be a computable function such that for all $e, i \in \omega$, $f(e, i)$ is an index 
for $X_{\Phi_e,\Phi_i}$. Now fix Turing functionals $\Phi_n$ and $\Phi_m$. Define a computable function g 
such that for all $i \in \omega$ and all oracles Z, 

\[
\Phi_{g(i)}(Z) = \Phi_m(\Phi_{f(n,i)} \oplus Z).
\]

By the recursion theorem, take a fixed point i for g, so $\Phi_{g(i)}(Z) = \Phi_i(Z)$ for all 
oracles Z. Then we have 

\[
\Phi_m(X_{\Phi_n,\Phi_i} \oplus Z) = \Phi_m(\Phi_{f(n,i)} \oplus Z) = \Phi_{g(i)}(Z) = \Phi_i(Z)
\]

for all Z. Now since $X_{\Phi_n,\Phi_i}$ witnesses that P is not strongly Weihrauch reducible to 
Q via $\Phi_n$ and $\Phi_i$, it also witnesses that P is not Weihrauch reducible to Q via $\Phi_n$ 
and $\Phi_m$. □


We conclude by mentioning, without proof for now, an example of a nonreduction 
under $\le_{sc}$. The following is analogous to Proposition 4.3.7, where we saw it for $\le_W$. 


Proposition 4.4.6. For all $k > j$, $\mathsf{RT}^1_k \nleq_{sc} \mathsf{RT}^1_j$. 


The proposition can be proved by a direct stage-by-stage construction, but in that 
form it is somewhat cumbersome. (The enthusiastic reader is encouraged to try it!) A 
cleaner presentation is as a forcing construction. We introduce forcing in Chapter 7, 
and give a proof of the above proposition (along with generalizations) in Section 9.1. 


4.5 Multiple applications 


Computable reducibility seems to be a good measure of the computational complexity 
of a problem, but it has a particular limitation. A computable reduction measures 
the complexity of a problem’s one time use. To explain, consider again 
Proposition 4.2.11, that $\widehat{\mathsf{IPHP}} \nleq_c \mathsf{TJ}$. Knowing how to find the Turing jump of a given set 
is not enough to find solutions to arbitrary instances of $\widehat{\mathsf{IPHP}}$. But knowing how to 
find the double jump certainly is. And intuitively, if we have a method for finding $X'$ 
from X, then we ought to be able to obtain $X''$ by appealing to this method twice. 

A different measure of a problem’s overall computational strength is therefore 
one that allows us to “apply” problems multiple times in a reduction. The following 
generalization of computable reducibility is a step in this direction. 


Definition 4.5.1 (Computable reducibility to multiple applications). Let P and 
Q be problems. For $m \ge 1$, P is computably reducible to m applications of Q if: 


• every P-instance X computes a Q-instance $\widehat{X}_0$; 

• for each $i < m - 1$, if $\widehat{Y}_i$ is any Q-solution to the Q-instance $\widehat{X}_i$ then 
$X \oplus \widehat{Y}_0 \oplus \cdots \oplus \widehat{Y}_i$ computes a Q-instance $\widehat{X}_{i+1}$; 

• if $\widehat{Y}_{m-1}$ is any Q-solution to the Q-instance $\widehat{X}_{m-1}$, then $X \oplus \widehat{Y}_0 \oplus \cdots \oplus \widehat{Y}_{m-1}$ 
computes a P-solution Y to X. 

The following basic facts are left to Exercise 4.8.7. 


Proposition 4.5.2. Let P, Q, and R be problems. 


1. $P \le_c Q$ if and only if P is computably reducible to one application of Q. 

2. For $l, m \ge 1$, if P is computably reducible to m applications of Q, and Q is 
computably reducible to l applications of R, then P is computably reducible to 
$m \cdot l$ applications of R. 


Now the objection raised above is alleviated. 


Proposition 4.5.3. $\widehat{\mathsf{IPHP}}$ is computably reducible to two applications of TJ. 


Proof. Fix an instance $X = \langle c_i : i \in \omega \rangle$ of $\widehat{\mathsf{IPHP}}$, so that each $c_i$ is a map $\omega \to \omega$ 
with bounded range. We pass to X itself, viewed as an instance of TJ. The only 
TJ-solution to this is $X'$, and given it, we pass to it as another instance of TJ. Now 
the only TJ-solution is $X''$, from which we can compute a solution to X: for each i, 
$X''$ can uniformly compute the least $j = j_i$ such that $c_i(x) = j$ for infinitely many x. 
Then $\langle j_i : i \in \omega \rangle$ is an $X''$-computable $\widehat{\mathsf{IPHP}}$-solution to X. □


The preceding result could be easily expressed without any appeal to problem 
reducibilities. Indeed, as stated it is merely a basic computability theoretic observation 
in different, and arguably more convoluted, parlance. We will see shortly that this 
is not always the case. And in Chapter 9 we will encounter examples that are very 
difficult to express otherwise. 

For completeness, we mention also the following basic fact, which says 
that Definition 4.5.1 can be “padded”. In particular, we have that being computably 
reducible to m applications of a problem is the same as being computably reducible 
to at most m applications. 


Proposition 4.5.4. Let P and Q be problems. If P is computably reducible to m 
applications of Q, then P is computably reducible to l applications of Q for every 
$l > m$. 


Proof. Suppose we have a P-instance X, along with Q-instances $\widehat{X}_0, \ldots, \widehat{X}_{m-1}$ with 
solutions $\widehat{Y}_0, \ldots, \widehat{Y}_{m-1}$, as in Definition 4.5.1. Thus, $X \oplus \widehat{Y}_0 \oplus \cdots \oplus \widehat{Y}_{m-1}$ computes 
a P-solution to X. Now for each i with $m \le i < l$, define $\widehat{X}_i = \widehat{X}_{m-1}$. For any choice 
of Q-solutions $\widehat{Y}_m, \ldots, \widehat{Y}_{l-1}$ to $\widehat{X}_m = \cdots = \widehat{X}_{l-1}$, we then have that $X \oplus \widehat{Y}_0 \oplus \cdots \oplus \widehat{Y}_{l-1}$ 
computes a P-solution. □


An interesting problem in this context is WKL. As it turns out, it always suffices 
just to apply it once. We will see this in a different guise also in Chapter 5. 


Proposition 4.5.5. If a problem P is computably reducible to m applications of WKL 
for any $m \ge 1$, then $P \le_c \mathsf{WKL}$. 


Proof. Fix a P-instance X and any $Y \gg X$. We claim Y computes a P-solution to 
X, whence the conclusion of the proposition follows. Fix a WKL-instance $T_0 \le_T X$, 
as in Definition 4.5.1. Then in particular, $Y \gg T_0$. Now suppose that for some 
$i < m$, we have defined WKL-instances $T_0, \ldots, T_i$ and for each $j < i$, a path $f_j$ 
through $T_j$ such that $T_{j+1} \le_T X \oplus f_0 \oplus \cdots \oplus f_j$. If $i > 0$, we assume inductively that 
$Y \gg X \oplus f_0 \oplus \cdots \oplus f_{i-1}$. So $Y \gg T_i$, and by Proposition 2.8.26, there is a path $f_i$ 
through $T_i$ such that $Y \gg X \oplus f_0 \oplus \cdots \oplus f_i$. In this way, we find a P-solution to X 
computable in $X \oplus f_0 \oplus \cdots \oplus f_{m-1}$ and hence, by construction, in Y. □


Corollary 4.5.6. Neither COH nor KL is computably reducible to m applications of 
WKL, for any $m \ge 1$. 


Once again, we can also consider uniform versions. 


Definition 4.5.7 (Weihrauch reducibility to multiple applications). Let P and Q 
be problems. For $m \ge 1$, P is Weihrauch reducible to m applications of Q if there is 
a Turing functional $\Delta$ satisfying the following. 


• If X is a P-instance then $\Delta(0, X)$ is a Q-instance $\widehat{X}_0$. 
• For each $i < m - 1$, if $\widehat{Y}_i$ is any Q-solution to the Q-instance $\widehat{X}_i$, then 
$\Delta(i + 1, X, \widehat{Y}_0, \ldots, \widehat{Y}_i)$ is a Q-instance $\widehat{X}_{i+1}$. 
• If $\widehat{Y}_{m-1}$ is any Q-solution to the Q-instance $\widehat{X}_{m-1}$, then 
$\Delta(m, X, \widehat{Y}_0, \ldots, \widehat{Y}_{m-1})$ is a P-solution Y to X. 
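
The shape of Definition 4.5.7 can be mimicked by a finite toy analogy in Python. This is not a computability-theoretic reduction: the problems are hypothetical finite stand-ins, with P being "sort a list of exactly M distinct integers" and Q being "return the minimum of a nonempty list", and the names `delta`, `run`, and `M` are assumptions of the sketch. What it does show is a single uniform procedure that receives the stage number, the original instance, and all earlier solutions.

```python
M = 3  # fixed number of applications of Q

def delta(i, X, solutions):
    """Stage i of the reduction: a Q-instance for i < M, a P-solution for i == M."""
    if i < M:
        # Next Q-instance: the entries of X not yet returned by earlier applications.
        return [v for v in X if v not in solutions]
    return list(solutions)  # the minima, in the order found, form the sorted list

def run(X, solve_Q):
    """Drive the M applications of Q, feeding each solution back into delta."""
    solutions = []
    for i in range(M):
        instance = delta(i, X, solutions)
        solutions.append(solve_Q(instance))
    return delta(M, X, solutions)

result = run([4, 1, 3], min)
print(result)
```

Note that `delta` consults the original instance X at every stage, mirroring the access to X in the definition.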


Thus, this is precisely the same as Definition 4.5.1, only the computations at each step 
are uniform. Naturally, we have the analogue of Proposition 4.5.2 (see Exercise 4.8.7). 


Proposition 4.5.8. Let P, Q, and R be problems. 


1. $P \le_W Q$ if and only if P is Weihrauch reducible to one application of Q. 

2. If P is Weihrauch reducible to m applications of Q 
then P is computably reducible to m applications of Q. 

3. If P is Weihrauch reducible to m applications of Q, then P is Weihrauch reducible 
to l applications of Q for every $l > m$. 

4. For $l, m \ge 1$, if Q is Weihrauch reducible to l applications of R and P is 
Weihrauch reducible to m applications of Q, then P is Weihrauch reducible to 
$l \cdot m$ applications of R. 


We illustrate this notion with Ramsey’s theorem for different numbers of colors. 


Proposition 4.5.9 (Dorais, Dzhafarov, Hirst, Mileti, and Shafer [72]). For all 
$n \ge 1$ and $j \ge 2$ and $m \ge 1$, $\mathsf{RT}^n_{j^m}$ is Weihrauch reducible to m applications of $\mathsf{RT}^n_j$. 


Proof. The proof is by induction on m. If m = 1, the result is clear. Assume the 
result is true for $m \ge 1$, and consider an instance c of $\mathsf{RT}^n_{j^{m+1}}$. We define an instance 
$d_0$ of $\mathsf{RT}^n_j$ (and hence of $\mathsf{RT}^n_{j^m}$, since $j \le j^m$) by letting 

\[
d_0(\vec{x}) = (\mu k < j)[\, j^m k \le c(\vec{x}) < j^m (k+1) \,]
\]

for all $\vec{x} \in [\omega]^n$. Then $d_0$ is uniformly computable from c. Given any infinite 
homogeneous set $H_0 = \{h_0 < h_1 < \cdots\}$ for $d_0$, say with color $k < j$, we define an 
instance $d_1$ of $\mathsf{RT}^n_{j^m}$ by letting 

\[
d_1(x_0, \ldots, x_{n-1}) = c(h_{x_0}, \ldots, h_{x_{n-1}}) - j^m k
\]

for all $\langle x_0, \ldots, x_{n-1} \rangle \in [\omega]^n$. Then $d_1$ is uniformly computable from $c \oplus H_0$. We 
now apply the inductive hypothesis to $d_1$. So for $1 \le i < m$, given an infinite 
homogeneous set $H_i$ for the $\mathsf{RT}^n_j$-instance $d_i$, we obtain another such instance $d_{i+1}$ 
uniformly computable in $c \oplus H_0 \oplus \cdots \oplus H_i$. And given an infinite homogeneous 
set $H_m$ for the $\mathsf{RT}^n_j$-instance $d_m$ we obtain an infinite homogeneous set H for $d_1$, 
uniformly computable in $c \oplus H_0 \oplus \cdots \oplus H_m$. Then $\{h_x : x \in H\}$ is an infinite 
homogeneous set for c. □
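
The arithmetic behind $d_0$ and $d_1$ is a quotient and remainder with respect to $j^m$, which a short Python sketch makes explicit. The function name `split_color` is hypothetical, and the sketch only treats a single color value rather than a full coloring of $[\omega]^n$.

```python
def split_color(value, j, m):
    """Split a color below j**(m+1) into a block index k < j and a residual below j**m."""
    block = value // j**m              # the least k with j^m k <= value < j^m (k+1)
    residual = value - j**m * block    # the color used by d_1 on a set homogeneous for d_0
    return block, residual

j, m = 3, 2
pairs = [split_color(v, j, m) for v in range(j ** (m + 1))]
print(pairs[:5])
```

Each color is recovered as $j^m k$ plus the residual, so a set homogeneous for both colorings is homogeneous for c.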


Corollary 4.5.10. For all $n \ge 1$ and all $k > j \ge 2$, there is an m so that $\mathsf{RT}^n_k$ is 
Weihrauch reducible to m applications of $\mathsf{RT}^n_j$. 


Proof. Fix m so that $j^m \ge k$. □


We use this opportunity to introduce an important operator on problems from the 
computable analysis literature. Intuitively, the compositional product defined below 
represents the problem of applying one problem and then a second, in sequence. 


Definition 4.5.11 (Compositional product). Let P and Q be problems. The 
compositional product of Q with P, written $Q * P$, is the problem whose instances are 
pairs $\langle X, \Gamma \rangle$ satisfying the following. 


• X is a P-instance. 
• $\Gamma$ is a Turing functional. 
• If Y is any P-solution to X then $\Gamma(X, Y)$ is a Q-instance. 


The solutions to any such $\langle X, \Gamma \rangle$ are all pairs $\langle Y, \widehat{Y} \rangle$, where Y is a P-solution to X 
and $\widehat{Y}$ is a Q-solution to $\Gamma(X, Y)$. 


It can be shown that $*$ is associative (Exercise 4.8.8). We can recast Definition 4.5.7 
as follows: P is Weihrauch reducible to $m \ge 1$ applications of Q if P is Weihrauch 
reducible to the m-fold compositional product of Q with itself. But the 
compositional product is more expressive. For example, it is evident from our proof of 
Proposition 4.5.9 that if $k = ij$ then $\mathsf{RT}^n_k \le_W \mathsf{RT}^n_i * \mathsf{RT}^n_j$. Another observation is the 
following. 


Proposition 4.5.12. For all $n, k \ge 2$, $\mathsf{RT}^n_k \le_W \mathsf{RT}^{n-1}_k * \mathsf{TJ} * \mathsf{TJ}$. 


Proof. We first examine the classic inductive proof of Ramsey’s theorem. Fix an 
instance c of $\mathsf{RT}^n_k$. We define an infinite set $\{x_0, x_1, \ldots\}$ and a sequence of infinite 
sets $R_0 \supseteq R_1 \supseteq \cdots$ satisfying the following for all $s \in \omega$. 


1. $x_s < x_{s+1}$. 
2. $x_s < \min R_s$. 
3. For all $\vec{x} \in [\{x_0, \ldots, x_s\}]^{n-1}$, the color $c(\vec{x}, y)$ is the same for all $y \in R_s$. 


Of course, part (3) is vacuous if $s < n - 2$. So for $s < n - 2$, we simply let $x_s = s$ and 
$R_s = \{y \in \omega : y > n - 2\}$. Now fix $s \ge n - 2$, and suppose we have defined $x_t$ and 
$R_t$ for all $t < s$. Let $x_s = \min R_{s-1}$, and enumerate the elements of $[\{x_0, \ldots, x_s\}]^{n-1}$ as 
$\vec{x}_1, \ldots, \vec{x}_l$. Let $R^0 = R_{s-1}$, and given $R^i$ for some $i < l$, choose the least color j so that 
$c(\vec{x}_{i+1}, y) = j$ for infinitely many $y \in R^i$ and set $R^{i+1} = \{y \in R^i : c(\vec{x}_{i+1}, y) = j\}$. 
Finally, let $R_s = R^l$. It is easily seen that this ensures the desired properties hold. 
We next define an instance d of Rigs. For each (s0,...,5n-2) € [w]”7!, let 


d(so, he ens) Sn-2) = C(Xso, eras Kg xy NE) 


for some (any) s > sy,_-2. By property (2) above, this uniquely defined. Now if H is 
any infinite homogeneous set for d, it is clear that {x, : s € H} is a homogeneous 
set for c. 

Observe that $d$ and the set $\{x_0, x_1, \ldots\}$ are uniformly computable from $c''$, and
the homogeneous set $\{x_s : s \in H\}$ is uniformly computable from $\{x_0, x_1, \ldots\}$ and
$H$, hence from $c'' \oplus H$. Thus, $\mathsf{RT}^n_k$ can be Weihrauch reduced to $\mathsf{RT}^{n-1}_k \ast \mathsf{TJ} \ast \mathsf{TJ}$ as
follows. First, the given $\mathsf{RT}^n_k$-instance $c$ is passed as a TJ-instance, yielding $c'$ since
this is the only solution. Next, $c'$ is passed as a TJ-instance, yielding $c''$. Now, from
$c''$ we uniformly compute the $\mathsf{RT}^{n-1}_k$-instance $d$, and from $c''$ and any $\mathsf{RT}^{n-1}_k$-solution
$H$ to $d$ we compute an $\mathsf{RT}^n_k$-solution to $c$, as needed. $\square$


4.6 $\omega$-model reducibility


The major limitation of computably reducing to $m$ applications is that the number
$m$ is fixed. Consider problems P and Q, with the instances of P partitioned into
(finitely or infinitely many) sets $I_0, I_1, \ldots$. For each $i$, let $P_i$ be the restriction of P
to $I_i$, and suppose $P_i$ is computably reducible to $m_i$ applications of Q for some $m_i$.
If we are only looking at finitely many $i$, then P itself is computably reducible to
$\max\{m_0, m_1, \ldots\}$ applications of Q. But if there are infinitely many $i$, and it happens
for example that $m_0 < m_1 < \cdots$, then there is no reason P should be computably
reducible to $m$ applications of Q for any $m$. And yet, each individual instance of P
still requires only “finitely many applications of Q” to solve.

There is thus a disconnect between the intuitive and formal notions here. In 
the example above, the latter seems only marginally better than just (ordinary) 
computable reducibility at describing the relationship between P and Q. 
Rather than further elaborating on our previous reducibilities, we find the right
formalism by starting from scratch. We would like to say that P is reducible to Q if 
having the computational resources to solve every given instance of Q implies the 
same for every given instance of P. What should having computational resources of 
a particular sort mean? We take a model theoretic view. 


Definition 4.6.1 ($\omega$-model). An $\omega$-model is a nonempty collection $\mathcal{S} \subseteq 2^\omega$ closed
downward under $\leq_T$ and $\oplus$.


We regard an $\omega$-model as representing a universe of sets that we, with our
computational resources, know how to find. On this view, we can think of being able
to “solve” a problem as a closure condition: our universe should be closed under
finding solutions to instances of Q.


Definition 4.6.2. An $\omega$-model $\mathcal{S}$ satisfies a problem P, written $\mathcal{S} \models P$, if for every
P-instance $X \in \mathcal{S}$ there is at least one solution $Y \in \mathcal{S}$.


Let us illustrate this concept with a familiar example. 


Proposition 4.6.3. Let $\mathcal{S}$ be an $\omega$-model. Then $\mathcal{S} \models \mathsf{WKL}$ if and only if for every
$A \in \mathcal{S}$ there exists an $X \in \mathcal{S}$ such that $X \gg A$.


Proof. First, suppose $\mathcal{S} \models \mathsf{WKL}$ and fix $A \in \mathcal{S}$. By Proposition 2.8.14, relativized
to $A$, there exists an infinite $A$-computable tree each of whose paths has PA degree
relative to $A$. Since $\omega$-models are closed under $\leq_T$, this tree belongs to $\mathcal{S}$. Since
$\mathcal{S} \models \mathsf{WKL}$, some path through this tree belongs to $\mathcal{S}$. Hence, $\mathcal{S}$ contains a set of PA
degree relative to $A$, as desired. Conversely, let $T \subseteq 2^{<\omega}$ be any infinite tree in $\mathcal{S}$,
and fix $X \in \mathcal{S}$ which has PA degree relative to $T$. By definition, $X$ computes a path
through $T$, which is then in $\mathcal{S}$. Since $T$ was arbitrary, $\mathcal{S} \models \mathsf{WKL}$. $\square$


In computability theory, $\omega$-models are called Turing ideals. Those that satisfy WKL
are called Scott sets, and those that satisfy the problem TJ are called jump ideals.
We can now define a reducibility based on the above notions.


Definition 4.6.4 ($\omega$-model reducibility). Let P and Q be problems.


1. P is $\omega$-model reducible to Q, written $P \leq_\omega Q$, if every $\omega$-model
satisfying Q also satisfies P.
2. P and Q are equivalent over $\omega$-models, written $P \equiv_\omega Q$, if $P \leq_\omega Q$ and $Q \leq_\omega P$.


In Chapter 5, we will find that $\omega$-models are indeed models in the usual sense, i.e.,
structures in a particular language satisfying a particular set of formulas. But for
now we may think of them simply as mathematical objects facilitating the above
definition.


Proposition 4.6.5.


1. $\leq_\omega$ is a transitive relation and $\equiv_\omega$ is an equivalence relation.
2. If $A$ is any set, then $\{Y : Y \leq_T A\}$ is an $\omega$-model.
3. If $A_0 \leq_T A_1 \leq_T \cdots$ are sets, then $\{Y : (\exists i)[Y \leq_T A_i]\}$ is an $\omega$-model.
4. If P admits computable solutions then every $\omega$-model satisfies P.


Note that every $\omega$-model contains the computable sets. In fact, from part (2) above,
we see that the collection of all computable sets forms an $\omega$-model.


Definition 4.6.6 (The model REC). 


1. The $\omega$-model consisting of precisely the computable sets is denoted REC.
2. A problem P is computably true if it is satisfied by REC. 


The terminology here is potentially confusing, so it is important to be careful. 
If a problem P admits computable solutions then it is computably true. But it is 
possible for a problem to be computably true yet not admit computable solutions. 
Nonetheless, such cases are sufficiently unusual or unnatural that many authors use 
the terms interchangeably. The following examples illustrate this. 


Example 4.6.7. Consider the assertion that every total function $\omega \to \omega$ is
computable. In constructive mathematics, this principle is known as Church's thesis (not
to be confused with the Church–Turing thesis from Chapter 2). From a classical
viewpoint, this principle is false. But it is not constructively disprovable, and we
can see this fact reflected in our framework as follows. We can view Church's thesis
as a problem whose instances are all total functions $f : \omega \to \omega$, with the solutions
to any such $f$ being all $e$ such that $f = \Phi_e$. This has instances with no solutions,
but this is allowed for by the definition of instance–solution problems. In particular,
these instances witness that the problem does not admit computable solutions. But
every computable instance has a solution, and this is an element of $\omega$ and so is also
computable. Hence, the problem form of Church's thesis is computably true.


Example 4.6.8. Let P be the problem whose instances are all sets, with the sole
solution to any computable instance being itself, and the solution to any
noncomputable instance $X$ being $X'$. Then P is computably true but it does not admit
computable solutions. Note, too, that this problem witnesses a failure of
relativization: the statement “every computable instance of P has a computable solution”
does not relativize. However, P is clearly designed solely for this purpose, and so
is not “natural” in any real sense. Ergo, it makes a very unconvincing
counterexample to Maxim 2.7.1.


We move on to relate $\omega$-model reducibility with reducibility to multiple
applications.


Proposition 4.6.9. Let P and Q be problems. If P admits computable solutions or,
for some $m \geq 1$, P is computably reducible to $m$ applications of Q, then $P \leq_\omega Q$.


Proof. Fix any $\omega$-model $\mathcal{S} \models Q$. If P admits computable solutions then it is satisfied
by every $\omega$-model, so in particular $\mathcal{S} \models P$. Assume therefore that P is computably
reducible to $m$ applications of Q for some $m \geq 1$. Fix any P-instance $X \in \mathcal{S}$. By
hypothesis, we obtain some Q-instance $X_0$ computable from $X$. Since $\omega$-models
are closed under $\leq_T$, this $X_0$ belongs to $\mathcal{S}$, and so it has some solution $Y_0$ in $\mathcal{S}$.
By hypothesis, we obtain some Q-instance $X_1$ computable from $X \oplus Y_0$ (assuming
$m > 1$). Since $\omega$-models are closed under $\oplus$ and $\leq_T$, this $X_1$ again belongs to $\mathcal{S}$.
Continuing, we find Q-instances $X_0, \ldots, X_{m-1}$ with solutions $Y_0, \ldots, Y_{m-1}$, all in $\mathcal{S}$.
Now by hypothesis, $X \oplus Y_0 \oplus \cdots \oplus Y_{m-1}$ computes some P-solution $Y$ to $X$, necessarily
in $\mathcal{S}$. Since $X$ was arbitrary, we conclude $\mathcal{S} \models P$, as desired. $\square$


It follows that if P is reducible to Q via any of the notions of reduction discussed
above then $P \leq_\omega Q$. Thus, $\leq_\omega$ is the coarsest of our reducibilities.

We can show that $\omega$-model reducibility is a strict coarsening by looking at the
following problem that is arguably somewhat artificial but illustrates the power of
$\omega$-models.


Proposition 4.6.10. Let P be the problem whose instances are all pairs $(X, n)$ where
$X \in 2^\omega$ and $n \in \omega$, with the unique solution to any such $(X, n)$ being $X^{(n)}$. Then
$P \equiv_\omega \mathsf{TJ}$, even though P neither admits computable solutions nor is it computably
reducible to $m$ applications of TJ for any $m$.


Proof. Note that TJ is identity reducible to P, so certainly $\omega$-model reducible. We
next show that $P \leq_\omega \mathsf{TJ}$. Fix any $\omega$-model $\mathcal{S} \models \mathsf{TJ}$ along with any P-instance $(X, n)$ in
$\mathcal{S}$. Then $X \in \mathcal{S}$, and since this is a TJ-instance, we must have $X' \in \mathcal{S}$. But $X'$ is also
a TJ-instance, so we must have $X'' \in \mathcal{S}$, and so on. It follows that $X^{(m)} \in \mathcal{S}$ for all
$m \geq 1$, and hence in particular, the P-solution $X^{(n)}$ to $(X, n)$ belongs to $\mathcal{S}$. So $\mathcal{S} \models P$.
To conclude, fix $m \geq 1$. We show P is not computably reducible to $m$ applications
of TJ. Consider the P-instance $(\emptyset, m + 1)$, which is computable (and hence belongs to
every $\omega$-model). It is easy to see from Definition 4.5.1 that if P were reducible to $m$
applications of TJ then the instance $(\emptyset, m + 1)$ would necessarily have a solution
$Y \leq_T \emptyset^{(m)}$. But this is false since its only solution is $\emptyset^{(m+1)}$. $\square$


A more natural example is the following, which will be of further significance to us
in Chapter 9, where we also give a proof.


Proposition 4.6.11. For each $n \geq 3$, we have $\mathsf{RT} \equiv_\omega \mathsf{RT}^n_2$, but RT neither admits
computable solutions nor is it computably reducible to $m$ applications of $\mathsf{RT}^n_2$ for
any $m$.


Speaking of Ramsey's theorem, notice that if we combine Proposition 4.5.9 with
the fact that $\mathsf{RT}^n_j$ is a subproblem of $\mathsf{RT}^n_k$ for all $n$ and all $j < k$, we obtain the
following result, which is of independent interest. It is instructive to look at another
proof.


Corollary 4.6.12. For all $n \geq 1$ and $j \geq 2$, $\mathsf{RT}^n \equiv_\omega \mathsf{RT}^n_j$.

Proof. Fix an $\omega$-model $\mathcal{S}$ of $\mathsf{RT}^n_j$ along with $k \geq 2$. We claim that $\mathcal{S} \models \mathsf{RT}^n_k$. The
proof is by induction on $k$. If $k \leq j$, the result is clear, and this includes the base
case, $k = 2$. So, assume $k > j$ and the result is true for $k - 1$. Let $c : [\omega]^n \to k$ be
an instance of $\mathsf{RT}^n_k$ in $\mathcal{S}$. Define $d : [\omega]^n \to k - 1$ as follows:

\[
d(\vec{x}) =
\begin{cases}
c(\vec{x}) & \text{if } c(\vec{x}) < k - 1, \\
k - 2 & \text{otherwise.}
\end{cases}
\]

Now, $d \leq_T c$, so $d \in \mathcal{S}$. Thus, by the induction hypothesis, we may fix an infinite
homogeneous set $H_d$ for $d$ in $\mathcal{S}$. If $H_d$ is homogeneous with color $i < k - 2$, then it is
also homogeneous for $c$, by construction. Otherwise, $c(\vec{x})$ is either $k - 2$ or $k - 1$, for
all $\vec{x} \in [H_d]^n$. In this case, enumerate the elements of $H_d$ as $h_0 < h_1 < \cdots$, and define
$e : [\omega]^n \to 2$ as follows:

\[
e(x_0, \ldots, x_{n-1}) =
\begin{cases}
0 & \text{if } c(h_{x_0}, \ldots, h_{x_{n-1}}) = k - 2, \\
1 & \text{otherwise,}
\end{cases}
\]

for all $(x_0, \ldots, x_{n-1}) \in [\omega]^n$. We have $e \leq_T c \oplus H_d$, hence $e \in \mathcal{S}$. Since $j \geq 2$,
$\mathcal{S}$ satisfies $\mathsf{RT}^n_2$, and so we may fix an infinite homogeneous set $H_e$ for $e$ in $\mathcal{S}$. Let
$H = \{h_x : x \in H_e\}$, which is computable from $H_e \oplus H_d$, and hence also belongs to
$\mathcal{S}$. Now if $H_e$ is homogeneous for $e$ with color 0 then $H$ is homogeneous for $c$ with
color $k - 2$, and if $H_e$ is homogeneous for $e$ with color 1 then $H$ is homogeneous for
$c$ with color $k - 1$. $\square$
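
The two auxiliary colorings in this proof are simple enough to write down directly. The following Python sketch is only a finite illustration under stated assumptions: colorings are modeled as functions on tuples, the helper names (`merge_top_two_colors`, `pull_back_to_two_colors`) are hypothetical, and the sample coloring `c` and the finite lists standing in for infinite homogeneous sets are invented for the example.

```python
from itertools import combinations


def merge_top_two_colors(c, k):
    """The coloring d: identify color k-1 with color k-2, keep the rest."""
    def d(x):
        return c(x) if c(x) < k - 1 else k - 2
    return d


def pull_back_to_two_colors(c, h, k):
    """The coloring e: color a tuple of indices by c on the matching h-values."""
    def e(xs):
        return 0 if c(tuple(h[x] for x in xs)) == k - 2 else 1
    return e


k = 3


# Sample 3-coloring of pairs that only ever uses colors k-2 = 1 and k-1 = 2.
def c(pair):
    a, b = pair
    return 1 if (a + b) % 2 == 0 else 2


d = merge_top_two_colors(c, k)

# Here d is constantly 1, so every set is homogeneous for d. We take H_d to be
# all of the natural numbers and list an initial segment h_0 < h_1 < ...
h = list(range(20))

e = pull_back_to_two_colors(c, h, k)

# The even numbers are homogeneous for e with color 0 (finite stand-in).
H_e = [0, 2, 4, 6, 8]
H = [h[x] for x in H_e]
```

In this example `H` is homogeneous for `c` with color `k - 2`, matching the first of the two cases at the end of the proof.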


We will reference this argument in Section 9.1. 

In practice, it is common for a problem P which is $\omega$-model reducible to a problem
Q to be in fact computably reducible, and often even Weihrauch reducible, to Q. As
illustrated in this chapter, the study of when one reduction holds and not another
helps tease out differences (subtle and not subtle) between problems, and so it is
also with $\omega$-model reducibility. But since this is the coarsest of our reducibilities,
separating problems using it gives the strongest measure of difference.

To understand such separations better, we now introduce an important method,
called iterating and dovetailing, for building $\omega$-models with prescribed computability
theoretic properties. We recall the definition of preservation from Definition 3.6.11.


Theorem 4.6.13. Let $\mathcal{C}$ be a class of sets closed downward under $\leq_T$, and let P and
Q be problems. If P admits preservation of $\mathcal{C}$ but Q does not then there is an $\omega$-model
$\mathcal{S} \subseteq \mathcal{C}$ satisfying P but not Q. In particular, $Q \nleq_\omega P$.


Proof. We build an $\omega$-model satisfying P but not Q. To this end, we first define
sets $A_0 \leq_T A_1 \leq_T \cdots$, as follows. Let $A_0 \in \mathcal{C}$ be a witness to Q not admitting
preservation of $\mathcal{C}$. Assume now that we have defined $A_i$ for some $i \in \omega$ and that
$A_i \in \mathcal{C}$. Say $i = \langle e, s \rangle$, so that $s \leq i$. If $\Phi_e^{A_s}$ is not a P-instance, let $A_{i+1} = A_i$.
Otherwise, using the fact that P admits preservation of $\mathcal{C}$, let $Y$ be any solution to
the P-instance $\Phi_e^{A_s}$ (which is also an $A_i$-computable instance) with $A_i \oplus Y \in \mathcal{C}$. Let
$A_{i+1} = A_i \oplus Y$.

Define $\mathcal{S} = \{Y \in 2^\omega : (\exists i)[Y \leq_T A_i]\}$. By Proposition 4.6.5, $\mathcal{S}$ is an $\omega$-model,
and $\mathcal{S} \subseteq \mathcal{C}$ because each $A_i$ is in $\mathcal{C}$ and $\mathcal{C}$ is closed downward under $\leq_T$. We claim
that $\mathcal{S}$ satisfies P. If $X$ is any P-instance in $\mathcal{S}$, then $X = \Phi_e^{A_i}$ for some
$e$ and $i$. Then $A_{\langle e, i \rangle + 1}$ computes a solution to $X$, which is consequently in $\mathcal{S}$. This
establishes the claim.

It remains only to show that $\mathcal{S}$ does not satisfy Q. Let $X$ be the $A_0$-computable
Q-instance in $\mathcal{C}$ such that no solution $Y$ to $X$ satisfies $A_0 \oplus Y \in \mathcal{C}$. Then $X \in \mathcal{S}$, and
we claim it has no solution in $\mathcal{S}$. Indeed, if some Q-solution $Y$ to $X$ were computable
from some $A_i$, then so would $A_0 \oplus Y$. But as $A_i$ is in $\mathcal{C}$, which is closed downward
under $\leq_T$, this would imply that $A_0 \oplus Y \in \mathcal{C}$, a contradiction. $\square$


Here, we see the reason for the name: we iterate the fact that P admits preservation of
$\mathcal{C}$ to find the solutions $Y$ above, “dovetailing” them into the construction by joining
with the $A_i$.
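
The bookkeeping behind the dovetailing is carried by the pairing function: every pair $(e, s)$ is handled at some stage $i$, and the second coordinate never exceeds the stage number. As a minimal sketch, assuming the Cantor pairing function as the choice of coding (the text does not fix a particular one, and the function names here are illustrative), this can be checked as follows.

```python
import math


def pair(e: int, s: int) -> int:
    """Cantor pairing: a bijection from pairs of naturals onto the naturals."""
    return (e + s) * (e + s + 1) // 2 + s


def unpair(i: int) -> tuple:
    """Inverse of pair: recover (e, s) from the code i."""
    w = (math.isqrt(8 * i + 1) - 1) // 2   # index of the diagonal containing i
    s = i - w * (w + 1) // 2
    return (w - s, s)


# Stage i of a dovetailed construction looks up (e, s) = unpair(i) and then
# deals with the e-th functional relative to the earlier set A_s.
schedule = [unpair(i) for i in range(6)]
```

Because `s <= pair(e, s)` always holds, the set $A_s$ consulted at stage $i$ has already been defined by the time it is needed.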


Corollary 4.6.14. If P admits cone avoidance then for every $C \nleq_T \emptyset$ there is an
$\omega$-model satisfying P that does not contain $C$. In particular, $\mathsf{TJ} \nleq_\omega P$.


Proof. Let $\mathcal{C}$ be the collection of sets $A$ so that $C \nleq_T A$. $\square$


Corollary 4.6.15. If P admits PA avoidance then there is an $\omega$-model satisfying P
that does not contain any set of PA degree. In particular, $\mathsf{WKL} \nleq_\omega P$.


Proof. Let $\mathcal{C}$ be the collection of sets $A \not\gg \emptyset$. $\square$


Another way to use Theorem 4.6.13 is to use complexity facts about a problem P
to construct specific $\omega$-models satisfying that problem.


Theorem 4.6.16. Fix any $D \gg \emptyset$. There exists an $\omega$-model $\mathcal{S}$ satisfying WKL such
that $D \gg Y$ for every $Y \in \mathcal{S}$.


Proof. Let $\mathcal{C}$ be the class of all sets $Y$ such that $D \gg Y$ and apply Proposition 3.6.12
and Theorem 4.6.13. $\square$


This result is often summarized by “WKL has $\omega$-models below every PA degree”.
By taking a low set of PA degree, say, we can obtain an $\omega$-model satisfying WKL
contained entirely in the class of low sets.

In general, having many examples of $\omega$-models of a problem P (without explicitly
involving another problem Q) is useful. It helps to produce separations when they
are needed, and yields a more complete understanding of P.


4.7 Hirschfeldt-Jockusch games 


We summarize all the reducibilities we have covered in Figure 4.6. There is an 
evolution of notions evident in the figure from left to right, with each reducibility 
being an elaboration on its predecessors, which are in turn special cases. We wrap up 
our discussion in this chapter with Hirschfeldt-Jockusch games, which encompass 
all our reducibilities in a single framework. 


Figure 4.6. Relationships between various problem reducibilities (diagram omitted; surviving
node labels: subproblem, identity reduction, computably to $m$ applications, Weihrauch to
$m$ applications). An arrow from one reducibility to another indicates that if P is reducible
to Q in the sense of the first, then it is also reducible in the sense of the second. No
additional arrows can be added.


Definition 4.7.1 (Hirschfeldt and Jockusch [148]). Let P and Q be problems. The
Hirschfeldt-Jockusch game $G(Q \to P)$ is a two-player game in which Players I and
II alternate playing subsets of $\omega$ as follows.


1. On move 0, Player I plays a P-instance $X$.

2. On move 0, Player II plays either an $X$-computable P-solution $Y$ to $X$ and wins,
or it plays an $X$-computable Q-instance $X_0$.

3. On move $i + 1$, Player I plays a Q-solution $Y_i$ to Player II's Q-instance $X_i$.

4. On move $i + 1$, Player II plays either an $X \oplus Y_0 \oplus \cdots \oplus Y_i$-computable solution
$Y$ to $X$, or an $X \oplus Y_0 \oplus \cdots \oplus Y_i$-computable Q-instance $X_{i+1}$.


The game ends if Player II ever plays a P-solution to $X$. If the game ends, Player II
wins, and otherwise Player I wins.
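
The flow of the game can be sketched as a simple loop. The Python below is a toy illustration only: it does not (and cannot) enforce the computability requirements on Player II's moves, Player I is modeled by a hypothetical solver `solve_q` that answers each Q-instance, and the sample problems (P: given an integer $x$, output $x + 3$; Q: given an integer $n$, output $n + 1$) are invented for the example.

```python
def play(x, player_two, solve_q, max_moves):
    """Run the game on the P-instance x for at most max_moves moves.

    Returns (solution, move) if Player II wins on that move, and
    (None, max_moves) if Player II has not won within the move budget.
    """
    q_solutions = []  # Y_0, Y_1, ... played so far by Player I
    for move in range(max_moves):
        kind, value = player_two(x, q_solutions)
        if kind == "P-solution":
            return value, move            # Player II wins on this move
        # Otherwise value is a Q-instance; Player I answers it on the next move.
        q_solutions.append(solve_q(value))
    return None, max_moves


# Toy problem Q: the solution to the instance n is n + 1.
def solve_q(n):
    return n + 1


# Toy strategy for Player II aiming at the P-solution x + 3:
# feed the latest answer back to Q until three answers have been collected.
def player_two(x, q_solutions):
    if len(q_solutions) == 3:
        return ("P-solution", q_solutions[-1])
    latest = q_solutions[-1] if q_solutions else x
    return ("Q-instance", latest)


outcome = play(5, player_two, solve_q, max_moves=10)
```

Here Player II asks for three applications of Q before producing a P-solution, which mirrors a reduction to three applications of Q in the sense of Section 4.5.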


There is a clear resemblance between these games and our reductions to multiple 
instances in Section 4.5. We make this explicit as follows. 


Definition 4.7.2. Let P and Q be problems.


1. A strategy for Player I in $G(Q \to P)$ is a function that, on input $i \in \omega$ and the
sets played by Player II on moves $j < i$, outputs a set Player I can play on move $i$.

2. A strategy for Player II in $G(Q \to P)$ is a function that, on input $i \in \omega$ and
the sets played by Player I on moves $j \leq i$, outputs a set Player II can play on
move $i$.

3. A strategy for Player I is winning if on each move $i$ when the game has not yet
ended, playing the value of the strategy on $i$ and the sets played by Player II on
moves $j < i$ ensures victory for Player I.

4. A strategy for Player II is winning if on each move $i$ when the game has not yet
ended, playing the value of the strategy on $i$ and the sets played by Player I on
moves $j \leq i$ ensures victory for Player II.

5. A strategy is computable if it is a Turing functional.


Proposition 4.7.3 (Hirschfeldt and Jockusch [148]). Let P and Q be problems.


1. P admits computable solutions if and only if Player II has a winning strategy in
$G(Q \to P)$ that ensures victory in one move.

2. P uniformly admits computable solutions if and only if Player II has a computable
winning strategy in $G(Q \to P)$ that ensures victory in one move.

3. For $m \geq 1$, P is computably reducible to $m$ applications of Q if and only if
Player II has a winning strategy in $G(Q \to P)$ that ensures victory in exactly $m$
moves.

4. For $m \geq 1$, P is Weihrauch reducible to $m$ applications of Q if and only if
Player II has a computable winning strategy in $G(Q \to P)$ that ensures victory
in exactly $m$ moves.


We can similarly express being a subproblem and being identity reducible in terms
of winning strategies (see Exercise 4.8.14). That only leaves $\omega$-model reducibility.
We now characterize this in terms of games, too.


Lemma 4.7.4. If P and Q are problems with $P \leq_\omega Q$, then either P admits
computable solutions or every instance $X$ of P computes an instance of Q.


Proof. Fix a P-instance $X$, and suppose $X$ computes no solution to itself. If $X$
computed no instance of Q, then $\{Y \in 2^\omega : Y \leq_T X\}$ would be an $\omega$-model that
(trivially) satisfies Q but not P, which cannot be. $\square$


Theorem 4.7.5 (Hirschfeldt and Jockusch [148]). Let P and Q be problems. If
$P \leq_\omega Q$ then Player II has a winning strategy in $G(Q \to P)$, and otherwise Player I
has a winning strategy.


Proof. Suppose $P \leq_\omega Q$. We describe a strategy for Player II. On a move $i$ when
the game has not yet ended, Player I will have played a P-instance $X$ and, for each
$j < i$, a Q-solution $Y_j$. Now, if there is a P-solution $Y$ to $X$ computable from the
join of $X$ and the $Y_j$ for $j < i$, then Player II can play it and win. If not, then fix an
$X$-computable Q-instance $\widehat{X}$, which exists by Lemma 4.7.4. Say $i = \langle e, s \rangle$, so that
$s \leq i$. If $\Phi_e$, with oracle the join of $X$ and the $Y_j$ for $j < s$, is a Q-instance, then
Player II plays it as $X_i$. Otherwise, Player II plays $\widehat{X}$ as $X_i$.

We claim Player II wins. Suppose not, so that the game goes on forever. Let $\mathcal{S}$ be
the set of all sets computable from $X \oplus Y_0 \oplus \cdots \oplus Y_i$ for some $i$. Then $\mathcal{S}$ is an $\omega$-model,
and by assumption, $\mathcal{S}$ contains $X$ but no P-solution to $X$. In particular, $\mathcal{S}$ does not
satisfy P. Now suppose $\widetilde{X}$ is any Q-instance in $\mathcal{S}$, say equal to $\Phi_e(X \oplus Y_0 \oplus \cdots \oplus Y_{s-1})$
for some $e$ and some $s \geq 0$. Then $\widetilde{X} = X_i$ for $i = \langle e, s \rangle$, which means that $Y_i$ is a
Q-solution to $\widetilde{X}$. Since $Y_i \in \mathcal{S}$, and since $\widetilde{X}$ was arbitrary, it follows that $\mathcal{S}$ satisfies Q.
This contradicts P being $\omega$-model reducible to Q. We conclude that Player II wins,
as claimed.

In the reverse direction, suppose that $P \nleq_\omega Q$. Fix an $\omega$-model $\mathcal{S}$ satisfying Q but
not P. We describe a strategy for Player I. On move 0, Player I plays a P-instance
$X$ in $\mathcal{S}$ with no solution in $\mathcal{S}$, which exists since $\mathcal{S} \not\models P$. Thus, Player II must play a
Q-instance on move 0, and this Q-instance is necessarily in $\mathcal{S}$. Now suppose that the
game has not ended through move $i$, and that all sets played on or before this move
belong to $\mathcal{S}$. Then Player II must have played a Q-instance $X_i \in \mathcal{S}$ on move $i$, and
since $\mathcal{S} \models Q$, Player I can play a Q-solution $Y_i \in \mathcal{S}$ to $X_i$ on move $i + 1$. Since Player II
must play computably in the join of the finitely many sets played by Player I, it must
play a set in $\mathcal{S}$. But then it cannot win on move $i + 1$ either. We conclude that the
game never ends, and hence Player I wins. $\square$


4.8 Exercises 


Exercise 4.8.1. A problem P admits universal instances if for every $A \in 2^\omega$, either P
has no $A$-computable instance, or there exists an $A$-computable instance $X^A$ such that
if $Y$ is any solution to $X^A$ then $A \oplus Y$ computes a solution to every other $A$-computable
instance. Show that WKL admits universal instances. (We will see another example
in Corollary 8.4.14.)


Exercise 4.8.2. Show that $\mathsf{WKL} \equiv_{sW} \mathsf{WKL}$.


Exercise 4.8.3. Consider the restriction of IPHP to instances $\langle f_e : e \in \omega \rangle$ such that
$\lim_x f_e(x)$ exists for every $e$ (i.e., each $f_e$ is eventually constant). Show that this
problem is computably reducible to TJ.


Exercise 4.8.4. Prove Proposition 4.3.7. 


Exercise 4.8.5.


1. Show that if $X \gg \emptyset$, then every computable, computably bounded tree $T \subseteq \omega^{<\omega}$
has an $X$-computable path.

2. Let P be the problem whose instances are pairs $(T, b)$ such that $T$ is a
$b$-bounded infinite tree, with the solutions to any such pair being all the infinite
paths through $T$. Show that $P \equiv_W \mathsf{WKL}$.


Exercise 4.8.6. Prove Proposition 4.4.3. 


Exercise 4.8.7. Prove Proposition 4.5.2 and formulate, and prove, a uniform ana- 
logue. 


Exercise 4.8.8. Prove that the compositional product, $\ast$, is associative.


Exercise 4.8.9. Construct a jump ideal that is not a Scott set.


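Exercise 4.8.10. Recall the problems $\mathsf{C}_{\mathbb{N}}$ and $\mathsf{C}_n$ from Exercise 3.9.9. For all
$n > m \geq 2$, prove the following.


1. $\mathsf{C}_m \leq_{sW} \mathsf{C}_n \leq_{sW} \mathsf{C}_{\mathbb{N}}$ for all $m < n$.
2. $\mathsf{C}_{\mathbb{N}} \nleq_W \mathsf{C}_n \nleq_W \mathsf{C}_m$ for all $m < n$.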


Exercise 4.8.11 (Pauly [247]; Brattka and Gherardi [19]). Let P and Q be
problems. Define the following problems.


1. $P \sqcup Q$ has domain all pairs $(0, x)$ for $x \in \mathrm{dom}(P)$ and $(1, y)$ for $y \in \mathrm{dom}(Q)$,
with the solutions to $(0, x)$ being all elements of $\{0\} \times P(x)$, and the solutions
to $(1, y)$ being all elements of $\{1\} \times Q(y)$.
2. $P \sqcap Q$ has domain $\mathrm{dom}(P) \times \mathrm{dom}(Q)$, with the solutions to $(x, y)$ being all
elements of $(\{0\} \times P(x)) \cup (\{1\} \times Q(y))$.


Show that $P \sqcup Q$ is the join (i.e., least upper bound) of P and Q under $\leq_W$, and that
$P \sqcap Q$ is their meet (i.e., greatest lower bound). Show that the same is true under $\leq_c$.


Exercise 4.8.12. Let $\mathcal{C}$ be a class of sets closed downward under $\leq_T$. Let $P_0$ and
$P_1$ be problems, each of which admits preservation of $\mathcal{C}$. Show that $P_0 + P_1$ admits
preservation of $\mathcal{C}$. (So in particular, if Q is a problem that does not admit preservation
of $\mathcal{C}$, then $Q \nleq_\omega P_0 + P_1$.)


Exercise 4.8.13. Show that there is an $\omega$-model of WKL consisting entirely of sets
of hyperimmune free degree.


Exercise 4.8.14. Formulate variants of the game $G(Q \to P)$, or conditions on
winning strategies, so as to obtain a characterization in the style of Proposition 4.7.3 of a
problem P being a subproblem of Q, or of being identity reducible to Q.


In the previous chapter, we discussed reducibility notions based on computability
theory. Another method for comparing problems relies on provability: if we
temporarily assume as an axiom that a problem P is solvable, how difficult is it to prove
that a second problem Q is solvable? If we can prove that Q is solvable under the 
assumption that P is solvable, this gives us information that Q is “weaker” than P, at 
least modulo the other axioms used in our proof. 

Viewing problems as representations of mathematical theorems, we can rephrase 
this second approach as: how hard is it to prove a particular theorem T under the 
assumption that we are allowed to invoke a theorem S as many times as we like? 
There is a clear analogy with $\omega$-model reducibility, but now we must also consider
models with a nonstandard first-order part, and must prove T. 

In this chapter we introduce the most common framework for this method: sub- 
systems of second order arithmetic. The use of these subsystems has been part of the 
field since its inception, and the systems are sometimes viewed as a defining charac- 
teristic of reverse mathematics. In Section 5.11, we give a more detailed comparison 
of computability theoretic reducibilities with reducibilities based on formal systems. 


5.1 Syntax and semantics 


Second order arithmetic is a collection of theories in two sorted first order logic. 
Elements of the first sort will be called numbers, while elements of the second sort 
will be called sets. We assume a familiarity with basic definitions and results of first 
order logic. In the remainder of this section, we discuss how these definitions are 
specialized to the particular case of second order arithmetic. 


Definition 5.1.1 (Signature of $\mathcal{L}_2$). The signature of second order arithmetic, $\mathcal{L}_2$,
consists of the following.


• Constant number symbols $0$ and $1$.
• Binary function symbols $+$ and $\cdot$ on numbers.
• Binary relation symbols $<$ and $=$ for numbers.
• A set membership relation $\in$ taking one number term and one set term.


The intended interpretation of the language, of course, is that number variables
should range over the set $\omega$ of (standard) natural numbers, set variables should range
over the powerset of $\omega$, the constants and arithmetic operations should be the standard
ones, $<$ should be the order relation on $\omega$, and $\in$ should denote set membership.

The equality relation for sets is not included in the signature $\mathcal{L}_2$. Instead, we treat
it as an abbreviation:

\[
X = Y \equiv (\forall z)[z \in X \leftrightarrow z \in Y].
\]


The reasons for this omission are discussed in Remark 5.1.8. We will silently assume 
the necessary definitional extension has been made whenever we use the set equality 
symbol in a formula. 

The syntax of second order arithmetic, including the sets of terms and (well 
formed) formulas, is built up from the signature $\mathcal{L}_2$ in the usual manner. There are
many numeric terms (for example, $1 + 1 + 1$ and $x + y + 0$) but, because there are
no term forming operations for sets, the only terms for sets are the set variables 
themselves. Rather than using superscripts to indicate the sort of each variable, as 
might be common in type theory, we will usually use lowercase Roman letters for 
number variables and uppercase Roman letters for set variables. (We may deviate 
from this in specific instances, when the type is understood from context.) 

Formulas in $\mathcal{L}_2$ may have several kinds of quantifiers. There are universal and
existential quantifiers for number variables and for set variables. There are also the
bounded quantifiers of the form $\forall x < t$ and $\exists x < t$, where $x$ is a number variable
and $t$ is a number term. As usual, these are equivalent to slightly more complicated
formulas with ordinary number quantifiers,

\[
(\exists x < t)\varphi \equiv (\exists x)[x < t \land \varphi],
\]
\[
(\forall x < t)\varphi \equiv (\forall x)[x < t \to \varphi].
\]
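
Over the standard natural numbers, a bounded quantifier amounts to a finite search, which is why bounded formulas can be evaluated mechanically. The following Python sketch illustrates this under simplifying assumptions: the term $t$ is given as an already computed natural number, the formula $\varphi$ is given as a Python predicate, and the helper names (`exists_below`, `forall_below`, `is_prime`) are illustrative only.

```python
def exists_below(t, phi):
    """Evaluate (exists x < t) phi(x) by searching x = 0, ..., t - 1."""
    return any(phi(x) for x in range(t))


def forall_below(t, phi):
    """Evaluate (forall x < t) phi(x) by checking x = 0, ..., t - 1."""
    return all(phi(x) for x in range(t))


def is_prime(n):
    """n > 1 and (forall a < n)(forall b < n)[a * b != n]."""
    return n > 1 and forall_below(
        n, lambda a: forall_below(n, lambda b: a * b != n)
    )


small_primes = [n for n in range(20) if is_prime(n)]
```

When the bound is $0$, the universal form holds vacuously and the existential form fails, in agreement with the two unbounded translations displayed above.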


An $\mathcal{L}_2$ theory is simply a set of sentences in the signature $\mathcal{L}_2$. As usual, if $T$ is
an $\mathcal{L}_2$ theory and $\varphi$ is an $\mathcal{L}_2$ sentence, we write $T \vdash \varphi$ if there is a formal proof of $\varphi$
from $T$ in one of the standard effective proof systems for two sorted first order logic.

The semantics of second order arithmetic are defined with a certain kind of first 
order structure. 


Definition 5.1.2 ($\mathcal{L}_2$ structures). An $\mathcal{L}_2$ structure is a tuple

\[
\mathcal{M} = (M, \mathcal{S}^{\mathcal{M}}, 0^{\mathcal{M}}, 1^{\mathcal{M}}, +^{\mathcal{M}}, \cdot^{\mathcal{M}}, <^{\mathcal{M}}),
\]

where $M$ is a set that serves as the domain for quantifiers over number variables,
$\mathcal{S}^{\mathcal{M}} \subseteq \mathcal{P}(M)$ serves as the domain for quantifiers over set variables, $0^{\mathcal{M}}$ and $1^{\mathcal{M}}$ are
fixed elements of $M$, $+^{\mathcal{M}}$ and $\cdot^{\mathcal{M}}$ are fixed functions from $M \times M$ to $M$, and $<^{\mathcal{M}}$ is
a binary relation on $M$.

The $\in$ relation is interpreted using the usual set membership relation.
Exercise 5.13.7 explains why there is no loss of generality in this convention.

There are a number of standard effective deductive systems for two sorted first 
order logic, which are equivalent in terms of their provable sentences. We will 
assume one of these systems has been chosen, giving a notion of formal derivability 
and a notion of syntactic consistency. We will use the term consistent to refer to this 
notion of syntactic consistency. 


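Definition 5.1.3 (Parameters). Let $\mathcal{M}$ be an $\mathcal{L}_2$ structure and let $\mathcal{B}$ be a subset of
$M \cup \mathcal{S}^{\mathcal{M}}$. Then $\mathcal{L}_2(\mathcal{B})$ denotes the language $\mathcal{L}_2$ expanded by adding a new constant
symbol (of the appropriate sort) for each element of $\mathcal{B}$. The new constants are called
(first order or second order) parameters from $\mathcal{M}$. If $\mathcal{B} = M \cup \mathcal{S}^{\mathcal{M}}$, we write $\mathcal{L}_2(\mathcal{M})$.
If $\mathcal{B}$ is finite, say $\mathcal{B} = \{B_0, \ldots, B_{k-1}\}$, we write $\mathcal{L}_2(B_0, \ldots, B_{k-1})$.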


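Definition 5.1.4 (Satisfaction). Let $\mathcal{M}$ be an $\mathcal{L}_2$ structure and $\mathcal{B}$ a subset of $M \cup \mathcal{S}^{\mathcal{M}}$.
Let $\varphi$ be a formula of $\mathcal{L}_2(\mathcal{B})$. Then the satisfaction relation $\mathcal{M} \models \varphi$ is defined in the
usual way via the T-scheme, with each $B \in \mathcal{B}$ interpreted in $\mathcal{M}$ by $B$.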


At the syntactic level, we always stay inside $\mathcal{L}_2$. But we say a formula $\varphi$ “has
parameters”, or words to this effect, to mean that it has free variables, which could be
substituted by actual parameters in $\mathcal{L}_2(\mathcal{B})$ for some $\mathcal{M}$ and $\mathcal{B}$. If we prove $\varphi$ in some
$\mathcal{L}_2$ theory $T$, then of course we are proving the universal closure of $\varphi$. So, if $\mathcal{M} \models T$,
then $\mathcal{M}$ will satisfy every formula resulting from substituting actual parameters for
the free variables of $\varphi$. In this sense, the “identity” of the parameters does not matter
in our syntactic treatment, though of course their presence or absence does.

The main purpose for formalizing second order arithmetic in first order logic is 
to ensure the following fundamental metatheorems of first order logic are applicable 
to our system. 


Theorem 5.1.5. The following hold.


• (Soundness theorem): If an $\mathcal{L}_2$ theory is satisfied by any $\mathcal{L}_2$ structure then it is
(syntactically) consistent.

• (Completeness theorem): If an $\mathcal{L}_2$ theory is consistent then it is satisfied by some
$\mathcal{L}_2$ structure.

• (Compactness theorem): An $\mathcal{L}_2$ theory is consistent if and only if each of its
finite subtheories is consistent.

• (Downward Löwenheim-Skolem theorem): If an $\mathcal{L}_2$ theory $T$ is consistent, then
it is satisfied by some $\mathcal{L}_2$ structure $\mathcal{M}$ in which both $M$ and $\mathcal{S}^{\mathcal{M}}$ have cardinality
$|T| + \aleph_0$.


In Chapter 4, we discussed $\omega$-models in the guise of Turing ideals. In the context
of second order arithmetic, there is a slightly broader notion: an $\omega$-model is typically
defined to be an arbitrary submodel of the standard model, with the same first order
part but possibly a smaller collection of sets. However, we will see in Theorem 5.5.3
that an $\omega$-model $\mathcal{M}$ in that sense satisfies the particular theory $\mathsf{RCA}_0$ if and only
if $\mathcal{S}^{\mathcal{M}}$ is a Turing ideal. Because we will generally be concerned only with models
of $\mathsf{RCA}_0$ or stronger theories, we will require the sets in an $\omega$-model to be a Turing
ideal.


Definition 5.1.6.


1. An $\omega$-model is an $\mathcal{L}_2$ structure $\mathcal{M}$ in which $M = \omega$, the arithmetical operations
are the standard ones, and $\mathcal{S}^{\mathcal{M}}$ is a Turing ideal.

2. More generally, if $\mathcal{N}$ is a submodel of $\mathcal{M}$ in which $N = M$ and $\mathcal{S}^{\mathcal{N}} \subseteq \mathcal{S}^{\mathcal{M}}$, then
$\mathcal{N}$ is an $\omega$-submodel of $\mathcal{M}$, and $\mathcal{M}$ is an $\omega$-extension of $\mathcal{N}$. In this case, it is not
required that $M$ is actually $\omega$.


An important fact, which we will see generalized in Theorem 5.9.3, is that if $\mathcal{N}$ is an
$\omega$-submodel of $\mathcal{M}$ then $\mathcal{N}$ and $\mathcal{M}$ satisfy the same arithmetical sentences (including
with set parameters from $S^{\mathcal{N}}$). See Exercise 5.13.1.


Remark 5.1.7. Because an $\omega$-model is completely determined by the collection of
subsets of $\omega$ it contains, we usually identify the $\omega$-model with this collection of
subsets. In this sense, our prior definition of $\omega$-model and the one given above are
the same. We will follow this convention moving forward and not draw a distinction
between the two notions of an $\omega$-model.


The standard model of second order arithmetic has $\omega$ as its first order part and
the powerset of $\omega$ as its second order part. The class of nonstandard $\omega$-models is of
particular interest because these share the same numerical properties as the standard
model, differing only in the collection of sets. In particular, although any model can
serve to show that a certain implication fails, a counterexample given by an $\omega$-model
shows that the reason for the failure is second order rather than first order. We will
see that there are implications which fail in general but which hold in every $\omega$-model,
so knowing that a nonimplication is witnessed by an $\omega$-model provides additional
information.


Remark 5.1.8 (Set equality). There is some variation in the literature about whether 
to include the set equality relation in the signature of second order arithmetic. From a 
purely foundational viewpoint, there is no harm in including it. There are two reasons 
we do not include it, which each involve the implicit quantifier in set equality. 

The first reason relates to computability of $\Delta^0_1$ sentences. Provided set equality is
not in the language, suppose that $\varphi(X, \vec{y}, \vec{Y})$ is a $\Sigma^0_1$ formula with a list of number
parameters $\vec{y}$ and a list of set parameters $\vec{Y}$, and $\psi(X, \vec{y}, \vec{Y})$ is an equivalent formula
that is $\Pi^0_1$. Then there is a computable functional $F$ with codomain $\{0, 1\}$ so that, for
every $X$, $F(X, \vec{y}, \vec{Y}) = 1$ if and only if $\varphi(X, \vec{y}, \vec{Y})$ holds. On the other hand, given
any set $Y$, there is no computable way to decide the formula $\varphi(X, Y) \equiv X = Y$. If set
equality were included in the language, the formula $X = Y$ would be quantifier free,
rather than $\Pi^0_1$ as it is under our conventions.

The second reason is the analogous issue with forcing. With our conventions, it is
usually trivial to decide the relation $p \Vdash \varphi$ when $\varphi$ is quantifier free. If formulas such
as $X = Y$ were quantifier free, the definition of forcing would be more complicated.

The set equality relation is closely related to axioms of extensionality. In second
order arithmetic, because the only relation symbol involving sets is $\in$, we automatically
have an axiom of extensionality (Exercise 5.13.8):


$$(\forall n)[n \in X \leftrightarrow n \in Y] \to [\varphi(X) \leftrightarrow \varphi(Y)].$$


In higher-order arithmetic or arithmetic based on function application rather than set 
membership, there are a number of extensionality axioms that arise in practice (see 
Kohlenbach [183] and Troelstra [313]). 


5.2 Hierarchies of formulas 


In Section 2.6, we classified sets and relations into an arithmetical hierarchy that 
measures the level of noncomputability of the relations included in that hierarchy, in 
a manner made precise by Post’s theorem. 

In this section, we define a parallel hierarchy that classifies formulas based on 
alternations of quantifiers. This leads to a second way to understand the complexity 
of sets and relations. The relationship between the two, which we discuss later in 
this chapter, is a key motivation for using second order arithmetic to study reverse 
mathematics. 


5.2.1 Arithmetical formulas 


An $\mathcal{L}_2$ formula is arithmetical if it has no set quantifiers. Arithmetical formulas may
still have free set variables and, in the extended language for an $\mathcal{L}_2$ structure, they
may have set constants. The next definition assigns finer classifications to some (but
not all) arithmetical formulas. The possible classifications are “$\Sigma^0_n$” for $n \in \omega$ and
“$\Pi^0_n$” for $n \in \omega$.


Definition 5.2.1. The arithmetical hierarchy is a classification of certain $\mathcal{L}_2$ formulas.


• A formula with only bounded quantifiers is given the classifications $\Sigma^0_0$ and $\Pi^0_0$.
• If $\varphi$ is $\Sigma^0_n$ then $(\forall \vec{x})\varphi$ is given the classification $\Pi^0_{n+1}$.
• If $\varphi$ has classification $\Pi^0_n$ then $(\exists \vec{x})\varphi$ is given the classification $\Sigma^0_{n+1}$.


Here $(\forall \vec{x})$ and $(\exists \vec{x})$ represent finite sequences of universal and existential number
quantifiers, respectively. To emphasize that a formula has a set parameter $B$, we use the
notations $\Sigma^{0,B}_n$, $\Pi^{0,B}_n$, $\Sigma^0_n(B)$, or $\Pi^0_n(B)$.


Not all arithmetical formulas are classified in the arithmetical hierarchy: the
definition only assigns a classification to formulas in prenex normal form. However,
a formula not in that form may still be logically equivalent to a formula with a
classification. For example $(\exists x)(x = x) \to (\forall y)[y = y]$ is logically equivalent to
$(\forall x)(\forall y)[x = x \to y = y]$, which is a $\Pi^0_1$ formula. By the usual result that every
formula is equivalent to a formula in prenex form, every arithmetical formula is
logically equivalent to some formula in the arithmetical hierarchy, possibly after
rewriting bounded quantifiers in an equivalent way using unbounded quantifiers if
necessary. A deeper analysis of the interaction between bounded quantifiers and
unbounded quantifiers is given in Proposition 6.1.2.

Similarly, because we may prefix a formula with dummy quantifiers, every $\Sigma^0_n$
formula is logically equivalent to a $\Pi^0_{n+1}$ formula, a $\Sigma^0_{n+2}$ formula, and so on. Thus, given
a formula $\varphi$, we will usually be interested in the minimal $n$ such that $\varphi$ is equivalent
to some $\Sigma^0_n$ or $\Pi^0_n$ formula.

We will sometimes abuse definitions by claiming that a formula $\varphi$ has a classification
when in fact $\varphi$ is only equivalent to a formula with that classification. This
is not as safe as it may seem: a key phenomenon is that a formula $\varphi$ may not be
logically equivalent to a formula $\psi$, but a nontrivial theory may prove that $\varphi \leftrightarrow \psi$, so
that when we work within that theory we can treat $\varphi$ as if it has the classification of
$\psi$. For example, the statement that a binary tree has an infinite path is not, naively,
an arithmetical statement, but over the subsystem $\mathsf{WKL}_0$ this statement is equivalent
to a $\Pi^0_1$ formula stating the tree is infinite. Similarly, we may have an equivalence
$\varphi \leftrightarrow \psi$ under the combined assumptions at a particular place in a proof, although $\varphi$
and $\psi$ are not equivalent in general.

The following three results link the arithmetical hierarchy of relations from Defi- 
nition 2.6.1 and the arithmetical hierarchy for formulas from Definition 5.2.1. They 
also give a sense of how computability theory can be formalized into second order 
arithmetic. The proofs are left to Exercise 5.13.17. 


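Lemma 5.2.2. For each primitive recursive function $f(\vec{n}, \vec{h})$ there is a $\Sigma^0_1$ formula
$\rho(\vec{n}, z, \vec{h})$ and a $\Pi^0_1$ formula $\psi(\vec{n}, z, \vec{h})$ such that, for all $\vec{h}$, $\vec{n}$, and $z$,


$$f(\vec{n}, \vec{h}) = z \leftrightarrow \rho(\vec{n}, z, \vec{h}) \leftrightarrow \psi(\vec{n}, z, \vec{h}).$$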


Applying the lemma to Kleene’s T and U functions yields the following theorem 
on the formalization of computability theory. 


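Theorem 5.2.3. For each $k$, there is a $\Sigma^0_1$ formula $\rho(e, \vec{n}, z, \vec{h})$ and a $\Pi^0_1$ formula
$\psi(e, \vec{n}, z, \vec{h})$ such that for all $e$, $\vec{h}$, $\vec{n}$ of length $k$, and $z$,


$$\Phi_e^{\vec{h}}(\vec{n}) = z \leftrightarrow \rho(e, \vec{n}, z, \vec{h}) \leftrightarrow \psi(e, \vec{n}, z, \vec{h}).$$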


Combining the previous theorem with Post’s theorem gives a tight connection 
between the two versions of the arithmetical hierarchy. 


Corollary 5.2.4. Let $B \subseteq \omega$ be arbitrary and $n > 0$. A set $X \subseteq \omega$ is $\Sigma^{0,B}_n$ (or $\Pi^{0,B}_n$)
if and only if it is definable by a $\Sigma^{0,B}_n$ (or $\Pi^{0,B}_n$, respectively) formula of $\mathcal{L}_2$.


This corollary is extremely useful for producing upper bounds on the classifi- 
cation of a set in the arithmetical hierarchy. Given a formula that defines a set, 
routine manipulations can put the formula into prenex normal form, at which point 
we can count the number of quantifier alternations to determine an upper bound. 
Rogers [259] called this method the Tarski-Kuratowski algorithm. 
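
The counting step of this method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the formula is assumed to be already in prenex normal form with a bounded-quantifier matrix, its unbounded number quantifiers are given as a string over the hypothetical alphabet `A` (universal) and `E` (existential), and the function name `classify_prefix` is ours, not the book's. The result is an upper bound in the sense of Definition 5.2.1, not necessarily the minimal classification.

```python
def classify_prefix(prefix: str) -> str:
    """Classify a prenex prefix of unbounded number quantifiers.

    `prefix` lists the quantifiers from the outside in, using 'A' for a
    universal and 'E' for an existential quantifier. Adjacent quantifiers
    of the same kind form one block, as in Definition 5.2.1.
    """
    if any(q not in "AE" for q in prefix):
        raise ValueError("prefix must consist of 'A' and 'E' only")
    if not prefix:
        # Only bounded quantifiers: the bottom level of the hierarchy.
        return "Sigma^0_0 and Pi^0_0"
    # Count the blocks: one more than the number of alternations.
    blocks = 1
    for prev, cur in zip(prefix, prefix[1:]):
        if prev != cur:
            blocks += 1
    kind = "Sigma" if prefix[0] == "E" else "Pi"
    return f"{kind}^0_{blocks}"
```

For instance, the prefix of $(\forall x)(\forall y)[x = x \to y = y]$ is `"AA"`, a single universal block, which gives the classification $\Pi^0_1$ mentioned earlier.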


5.2.2 Analytical formulas


The arithmetical hierarchy is limited to formulas with no set quantifiers. The analytical
hierarchy is an analogous classification for formulas that may contain set
quantifiers. While the arithmetical hierarchy begins with bounded quantifier formulas
at the bottom level, the analytical hierarchy begins with arithmetical formulas at
the bottom level. The possible classifications in the analytical hierarchy are $\Sigma^1_n$ for
$n \in \omega$ and $\Pi^1_n$ for $n \in \omega$.


Definition 5.2.5. We inductively assign classifications as follows.


• Every arithmetical formula is given classifications $\Sigma^1_0$ and $\Pi^1_0$.
• If $\varphi$ is $\Sigma^1_n$ then $(\forall \vec{X})\varphi$ is given the classification $\Pi^1_{n+1}$.
• If $\varphi$ has classification $\Pi^1_n$ then $(\exists \vec{X})\varphi$ is given the classification $\Sigma^1_{n+1}$.


Here $(\forall \vec{X})$ and $(\exists \vec{X})$ represent finite sequences of universal and existential set
quantifiers, respectively. To emphasize that a formula has a set parameter $B$, we
write $\Sigma^{1,B}_n$, $\Pi^{1,B}_n$, $\Sigma^1_n(B)$, or $\Pi^1_n(B)$.


Each formula that has a classification in the analytical hierarchy is in a special
kind of prenex form in which the set quantifiers appear first, followed by the number
quantifiers, followed by a matrix which possesses only bounded quantifiers. Unlike
the arithmetical case, we cannot simply use prenex normal form to place a formula
into the analytical hierarchy, because we cannot easily move a number quantifier
inside a set quantifier. We will soon discuss a system $\mathsf{ACA}_0$ which is strong enough
to allow that kind of quantifier rearrangement.


5.3 Arithmetic 


The goal of the next two sections is to describe axiom systems that are used to
develop theories of second order arithmetic. The axioms can be divided into two
kinds. The first order axioms ensure that basic number theoretic properties from
$\omega$ will hold for the models we study, while the second order axioms describe the
interaction between the numbers of a model and its sets.


5.3.1 First order arithmetic 


We begin with the first order axioms, which are those that can be stated in a more 
restricted language. 


Definition 5.3.1. The language $\mathcal{L}_1$ of first order arithmetic contains two binary
function symbols, $+$ and $\cdot$; two constant symbols $0$ and $1$; an order relation symbol
$<$; and an equality relation symbol $=$. The set of $\mathcal{L}_1$ formulas is defined using this
signature and number quantifiers only. Thus an $\mathcal{L}_2$ formula is an $\mathcal{L}_1$ formula if and
only if it contains no set variables whatsoever.


Every $\mathcal{L}_1$ formula is also an $\mathcal{L}_2$ formula. The definition of the arithmetical hierarchy
(Definition 5.2.1) thus extends naturally also to $\mathcal{L}_1$ formulas. To be sure, there is a
difference between $\mathcal{L}_1$ and $\mathcal{L}_2$ even for arithmetical formulas, since the latter can
include parameters. But we tend to use the classes $\Sigma^0_n$ and $\Pi^0_n$ to refer to formulas
both in $\mathcal{L}_1$ and $\mathcal{L}_2$, and rely on context to distinguish between the two when it
matters. (See also the discussion at the beginning of Chapter 6.)

Peano arithmetic is often defined using only a successor operation and a few
axioms including a second order induction axiom. In the first order setting, it is
necessary to include the addition and multiplication relations from the start, because
neither of these relations is first order definable over the structure $(\omega, S)$. To simplify
working with restricted induction axioms, it is common to include a longer list of
basic axioms (see also Kaye [176]).


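Definition 5.3.2. The first order theory $\mathsf{PA}^-$ is the theory in the signature $\mathcal{L}_1$ with the
following axioms, which describe a discrete ordered semiring.


1. $(\forall x, y, z)[(x + y) + z = x + (y + z)]$.
2. $(\forall x, y)[x + y = y + x]$.
3. $(\forall x, y, z)[(x \cdot y) \cdot z = x \cdot (y \cdot z)]$.
4. $(\forall x, y)[x \cdot y = y \cdot x]$.
5. $(\forall x, y, z)[x \cdot (y + z) = x \cdot y + x \cdot z]$.
6. $(\forall x)[x + 0 = x \land x \cdot 0 = 0 \land x \cdot 1 = x]$.
7. $(\forall x, y, z)[x < y \land y < z \to x < z]$.
8. $(\forall x)[x \nless x]$.
9. $(\forall x, y)[x < y \lor x = y \lor y < x]$.
10. $(\forall x, y, z)[x < y \to x + z < y + z]$.
11. $(\forall x, y, z)[0 < z \land x < y \to x \cdot z < y \cdot z]$.
12. $(\forall x, y)(\exists z)[x < y \to x + z = y]$.
13. $(0 < 1) \land (\forall x)(x > 0 \to x \geq 1)$.
14. $(\forall x)[x = 0 \lor x > 0]$.

Here we have adopted the conventions that $a \nless b$ abbreviates $\neg(a < b)$ and $a \leq b$
abbreviates $a = b \lor a < b$.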


The theory $\mathsf{PA}^-$ captures the basic properties of the natural numbers as a discrete
ordered semiring. To verify stronger properties, we will use induction axioms.


Definition 5.3.3. Peano arithmetic is the first order theory in the signature $\mathcal{L}_1$ that
is obtained by adjoining to $\mathsf{PA}^-$ every instance of the induction scheme


$$(\varphi(0) \land (\forall x)[\varphi(x) \to \varphi(x + 1)]) \to (\forall x)\varphi(x) \tag{5.1}$$


in which $\varphi$ is a formula of $\mathcal{L}_1$.


The key role of first order induction axioms in reverse mathematics is to verify
number theoretic properties of the model. We end the section with one of the most
important of these, which is the existence of a pairing function that codes pairs of
numbers by numbers. We already fixed such a function in Section 1.5, but we also
need a pairing function in our formal theories of arithmetic. This is accommodated
by the following theorem.


Theorem 5.3.4. $\mathsf{PA}^-$ together with induction for all $\Sigma^0_1$ formulas of $\mathcal{L}_1$ proves that
the function $(n, m) \mapsto (n + m)^2 + m$ is an injection from pairs of numbers to numbers.


Of course, $\mathsf{PA}^-$, as an $\mathcal{L}_1$ theory, cannot speak of functions directly. So what the
above means is that $\mathsf{PA}^-$ can verify the properties of $(n + m)^2 + m$ relative to $n$ and $m$
that make the map above an injective function. (For example, that for all $n, n^*, m, m^*$,
if $(n + m)^2 + m = (n^* + m^*)^2 + m^*$, then $n = n^*$ and $m = m^*$.) The proof is standard,
but of limited interest for our present purposes. We refer the reader to Simpson [288,
Theorem II.2.2] for complete details.

As in our informal discussion, we denote the code for the pair $(n, m)$ by $\langle n, m \rangle$.
So, $\langle n, m \rangle = (n + m)^2 + m$. This makes the definition of $\langle n, m \rangle$ simple, since it is just
a term of $\mathcal{L}_1$, and therefore easy to work with. A downside is that, unlike the pairing
function discussed in Chapter 1, this function is not surjective. This is not a problem
for proving the basic results we need to get our discussion underway, however, most
notably, the coding of finite sets in Section 5.5.2. In short order, we will have the
tools to also define a bijection $\mathbb{N} \times \mathbb{N} \to \mathbb{N}$ (see Exercise 5.13.25), and at that point
we may use $\langle n, m \rangle$ to refer to this bijection if convenient.
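
Both features of this pairing function, injectivity and the failure of surjectivity, can be observed concretely in the standard model. The following Python sketch is only an illustration: the names `pair` and `unpair` are ours, and the inverse relies on the observation (an assumption of the sketch, easily checked) that $(n + m)^2 \leq \langle n, m \rangle < (n + m + 1)^2$, so that $n + m$ is the integer square root of the code.

```python
from math import isqrt


def pair(n: int, m: int) -> int:
    """The pairing term <n, m> = (n + m)^2 + m."""
    return (n + m) ** 2 + m


def unpair(c: int):
    """Return the unique (n, m) with pair(n, m) == c, or None if c codes no pair."""
    s = isqrt(c)      # candidate value of n + m
    m = c - s * s     # candidate value of m
    if m > s:
        # m would exceed n + m, so c is not in the range of the pairing map.
        return None
    return (s - m, m)
```

For example, `unpair(3)` returns `None`: the codes of the pairs with $n + m \leq 1$ are $0$, $1$, and $2$, and the next code is $\langle 2, 0 \rangle = 4$, so $3$ is not the code of any pair.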


5.3.2 Second order arithmetic 


In an arbitrary $\mathcal{L}_2$ structure $\mathcal{M}$, the collection of sets may be much smaller than
the powerset of $M$. Thus, in defining theories of second order arithmetic, we must
include explicit axioms stating that particular sets exist, to guarantee the sets we wish
to use will be present in the models of our theories. Informally, axioms that imply
the existence of sets will be called set existence axioms.

By carefully choosing which set existence axioms are included (along with other 
axioms), we will be able to construct theories of various strengths. In particular, we 
will consider axioms that state that if a set X exists, and Y is definable by a formula 
with X as a parameter, then Y also exists. 

This manner of speaking deserves a small comment, because it is trivial that the
set $Y$ will exist in the usual mathematical sense. When we speak of sets that “exist”,
we are thinking in the context of some fixed (possibly nameless) $\mathcal{L}_2$ structure: a set
exists if it is a set in this structure.

Usually, we will try to be precise and only call something a set if it does exist
in this sense. We will then refer to every other collection of numbers as just that
(or, following set theoretic terminology, as a class). But occasionally, it may be
convenient to use the word “set” informally, and here we will rely on the word
“exists” to distinguish our meaning. For example, we will often encounter situations
where not every $\Sigma^0_1$ definable collection of numbers exists. Here, we may refer to
such a collection as a “$\Sigma^0_1$ definable set”, so long as we are careful not to state that
the collection exists, or to treat it as if it does.


Definition 5.3.5. Let $\varphi$ be a formula in the language $\mathcal{L}_2$.


1. The comprehension axiom for $\varphi$ is the universal closure of

$$(\exists X)(\forall x)[x \in X \leftrightarrow \varphi(x)],$$

where $X$ is a set variable not mentioned in $\varphi$. The formula $\varphi$ may have free set
variables, which serve as parameters relative to which $X$ is defined.
2. The induction axiom for $\varphi$ is the universal closure of


$$(\varphi(0) \land (\forall x)[\varphi(x) \to \varphi(x + 1)]) \to (\forall x)\varphi(x). \tag{5.2}$$


3. For $\Gamma$ a collection of formulas of $\mathcal{L}_2$, $\Gamma$-$\mathsf{CA}$ is the axiom scheme consisting of the
comprehension axiom for every $\varphi \in \Gamma$, while $\mathsf{I}\Gamma$ is the axiom scheme consisting
of the induction axiom for every $\varphi \in \Gamma$.


Comprehension axioms are typical set existence axioms. There are other set existence 
axioms, however, as we will comment on below. 


Definition 5.3.6. The theory $\mathsf{Z}_2$ of full second order arithmetic includes the axioms
of $\mathsf{PA}^-$, the induction axiom scheme for all $\mathcal{L}_2$ formulas, and the comprehension
axiom scheme for all $\mathcal{L}_2$ formulas.


$\mathsf{Z}_2$ is an amazingly strong theory. Indeed, it is somewhat challenging to find a
mathematical theorem that can be naturally expressed as an $\mathcal{L}_2$ formula but cannot be
proved in $\mathsf{Z}_2$. Of course, there are well known examples, such as $\mathrm{Con}(\mathsf{Z}_2)$, that cannot
be proved in $\mathsf{Z}_2$ due to incompleteness. A somewhat more mathematical example,
due to Friedman [107], is that determinacy for $\Sigma^0_5$ games is not provable in $\mathsf{Z}_2$ (see
Section 12.3.4). Other results known to be provable in ZFC (Zermelo–Fraenkel set
theory with the axiom of choice) but not $\mathsf{Z}_2$ are of similar character: they are familiar
to set theorists but not to most undergraduate mathematics students.

Because $\mathsf{Z}_2$ is so strong, we consider fragments obtained by weakening the collection
of comprehension and induction axioms that may be used. These subsystems
of $\mathsf{Z}_2$ are a key subject of (classical) reverse mathematics. A typical such subsystem
has the shape

$\mathsf{PA}^-$ + a set existence axiom or scheme + $\mathsf{I}\Sigma^0_1$


with the set existence scheme often (but not always) being $\Gamma$-$\mathsf{CA}$ for some collection of
formulas $\Gamma$. The motivations for restricting the induction axiom (5.2) to $\Sigma^0_1$ formulas
are discussed in Section 6.6.

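$\Gamma$-$\mathsf{CA}$ proves, for each $\varphi \in \Gamma$, that a set $X = \{n : \varphi(n)\}$ exists. The weak induction
scheme $\mathsf{I}\Sigma^0_1$ already includes the set induction axiom


$$\mathsf{I}_{\mathrm{set}}: \quad (0 \in X \land (\forall n)[n \in X \to n + 1 \in X]) \to (\forall n)[n \in X].$$

Thus combining $\Gamma$-$\mathsf{CA}$ with $\mathsf{I}\Sigma^0_1$ gives all of $\mathsf{I}\Gamma$. In particular, if $\Gamma \supseteq \Sigma^0_1$ then all of
the theories $\mathsf{PA}^- + \Gamma\text{-}\mathsf{CA} + \mathsf{I}\Sigma^0_1$, $\mathsf{PA}^- + \Gamma\text{-}\mathsf{CA} + \mathsf{I}_{\mathrm{set}}$, and $\mathsf{PA}^- + \Gamma\text{-}\mathsf{CA} + \mathsf{I}\Gamma$ are the same.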


Remark 5.3.7 (The subscript 0). The main subsystems of $\mathsf{Z}_2$ encountered in reverse
mathematics are all named for their respective comprehension axiom or scheme
and decorated with a subscript 0 to indicate the presence of $\mathsf{PA}^-$ and $\mathsf{I}\Sigma^0_1$. Thus, for
example, $\Pi^1_1$-$\mathsf{CA}_0$ denotes the system $\mathsf{PA}^- + \Pi^1_1\text{-}\mathsf{CA} + \mathsf{I}\Sigma^0_1$, etc. Sometimes, however,
dropping the subscript is used to indicate not the weaker system consisting of the
comprehension axiom alone, but the stronger system obtained by the addition of
induction for all formulas of $\mathcal{L}_2$. There is no good convention about this in the
literature, which can lead to some confusion. We will largely avoid the latter practice
in this book, but in any case will always be explicit when the subscript 0 is omitted.


5.4 Formalization, and on $\omega$ and $\mathbb{N}$


The language $\mathcal{L}_2$ lends itself naturally to formalizing various statements about $\omega$
and subsets of $\omega$. Once we develop a system for coding finite sequences by numbers
in a weak subsystem of $\mathsf{Z}_2$ (see Section 5.5.2), we will be able to formalize more
complicated objects defined in terms of $\omega$, like sequences of sets of numbers, the set
$2^{<\omega}$, functions on $\omega$ of arbitrary arity, etc.

It is traditional, when arguing about some such formalized object in a subsystem
of $\mathsf{Z}_2$, to use the symbol $\mathbb{N}$ in place of $\omega$. In this way, we reserve $\omega$ for the bona fide
set of natural numbers. In $\mathsf{Z}_2$, we instead speak of $2^{<\mathbb{N}}$, about functions $\mathbb{N} \to \mathbb{N}$, etc.,
because the numbers in an arbitrary model may be nonstandard. Formally, these are
abbreviations for certain definitions in $\mathcal{L}_2$, and hence care must be taken that the objects
so defined actually exist, in the sense discussed at the beginning of this section, in
whatever formal theory we are employing.

Yet a third notation is used when dealing with structures. In an $\mathcal{L}_2$ structure $\mathcal{M}$ we
will denote the “set of numbers” by $M$, and then speak of the set $2^{<M}$, of functions
$M \to M$, etc. Here, we must again bear in mind that $M$ may be nonstandard,
which can have implications for even some very basic notions (discussed in detail in
Chapter 6). Of course, there is a tight connection between this and how much of $\mathsf{Z}_2$
holds in $\mathcal{M}$.

We move freely between these notations when discussing definitions, theorems,
and proofs. In practice, this simply involves switching between $\omega$, $\mathbb{N}$, or $M$ (for a
structure $\mathcal{M}$). Formally, of course, there is more going on. For example, consider
the infinitary pigeonhole principle from Definition 3.2.3. As a $\forall\exists$ theorem, this
states that for every function $f \colon \omega \to \omega$ with bounded range there is an $i \in \omega$ such
that $f^{-1}\{i\}$ is infinite. In practice, we would write this in $\mathsf{Z}_2$ as “for every function
$f \colon \mathbb{N} \to \mathbb{N}$ with bounded range there is an $i \in \mathbb{N}$ such that $f^{-1}\{i\}$ is infinite”. But in
actuality, we are using the fact that $\mathsf{Z}_2$ can code pairs of numbers, and interpreting
the preceding as shorthand for the more opaque formula

$$
\begin{aligned}
(\forall X)\big[\, \big(\, &(\forall x)(\exists y)[\langle x, y \rangle \in X] \\
{}\land{} &(\forall x)(\forall y)(\forall z)[\langle x, y \rangle \in X \land \langle x, z \rangle \in X \to y = z] \\
{}\land{} &(\exists z)(\forall x)(\forall y)[\langle x, y \rangle \in X \to y < z] \,\big) \\
{}\to{} &(\exists y)(\forall z)(\exists x)[x > z \land \langle x, y \rangle \in X] \,\big].
\end{aligned}
$$


5.5 The subsystem $\mathsf{RCA}_0$


The weakest subsystem of $\mathsf{Z}_2$ we will consider is called $\mathsf{RCA}_0$. The initialism stands
for “recursive comprehension axiom”. As usual throughout the reverse mathematics
and computability literature, “recursive” in this sense is a synonym for “computable”.
And, as we will see, $\mathsf{RCA}_0$ roughly corresponds to a formalization of computable
mathematics.


5.5.1 $\Delta^0_1$ comprehension


Definition 5.5.1. The $\Delta^0_1$ comprehension scheme consists of the universal closure of
each axiom of the form


$$(\forall n)[\varphi(n) \leftrightarrow \psi(n)] \to (\exists X)(\forall n)[n \in X \leftrightarrow \varphi(n)],$$


in which $\varphi$ is a $\Sigma^0_1$ formula and $\psi$ is a $\Pi^0_1$ formula.

Intuitively, this axiom says that if a set $X$ is defined by a $\Sigma^0_1$ formula and also by a
$\Pi^0_1$ formula, then we may assert that $X$ exists. We say that $X$ is $\Delta^0_1$ definable relative
to the parameters of the formulas.

From the lens of computability theory, a set that is both $\Sigma^0_1$ and $\Pi^0_1$ is computable
relative to the parameters of the formulas. Accordingly, the $\Delta^0_1$ comprehension
scheme can be informally rephrased as: if a set $X$ is computable relative to other
sets that exist, then $X$ also exists.


Definition 5.5.2. The formal system $\mathsf{RCA}_0$ consists of the basic numerical axioms
$\mathsf{PA}^-$, the $\Delta^0_1$ comprehension scheme, and the induction axiom (5.2) restricted to $\Sigma^0_1$
formulas.


With reference to Remark 5.3.7, we note that $\mathsf{RCA}$ in the literature often refers to
$\mathsf{RCA}_0$ together with full induction.

The analogy between $\mathsf{RCA}_0$ and computable mathematics is made more precise
by the following theorem. It is common, when trying to understand the spirit of a
subsystem, to consider the special case of $\omega$-models. This allows us to temporarily
ignore the first order part of the theory and focus specifically on its set existence
strength. The following theorem also justifies our restriction of $\omega$-models in Definition
5.1.6 to models in which $S^{\mathcal{M}}$ is a Turing ideal.


Theorem 5.5.3. Suppose that $\mathcal{M}$ is a submodel of the standard model of $\mathsf{Z}_2$. Then
$\mathcal{M} \models \mathsf{RCA}_0$ if and only if $S^{\mathcal{M}}$ is closed under Turing join and relative computability
(i.e., if and only if $\mathcal{M}$ is a Turing ideal).


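Proof. First, assume $\mathcal{M}$ satisfies the $\Delta^0_1$ comprehension scheme. If $A, B \in S^{\mathcal{M}}$, then
$A \oplus B$ is in $S^{\mathcal{M}}$, because $A \oplus B$ is $\Delta^0_1$ definable from $A$ and $B$. Similarly, if $A \in S^{\mathcal{M}}$
and $B \leq_T A$ then $B$ is $\Delta^0_1$ definable from $A$ and hence $B \in S^{\mathcal{M}}$.

Conversely, assume that $S^{\mathcal{M}}$ is a Turing ideal. If a set $B$ is $\Delta^0_1$ definable relative to
a sequence $A_1, \ldots, A_n$ of sets in $S^{\mathcal{M}}$ then $B$ is $\Delta^0_1$ definable from $A = A_1 \oplus \cdots \oplus A_n \in
S^{\mathcal{M}}$. But then $B$ is computable from $A$, so $B$ is in $S^{\mathcal{M}}$, as desired. □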
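
The closure under join used in the proof can be made concrete. The sketch below assumes the usual convention $A \oplus B = \{2n : n \in A\} \cup \{2n + 1 : n \in B\}$, which this section does not restate, and represents a set by a Python function deciding membership; the names `Oracle`, `join`, `left`, and `right` are ours. It illustrates that the join is decidable by a single case split from the two sets, and that each set is recovered from the join by a single query.

```python
from typing import Callable

# A set of natural numbers, given by its membership test.
Oracle = Callable[[int], bool]


def join(a: Oracle, b: Oracle) -> Oracle:
    """A (+) B: even positions answer questions about A, odd ones about B."""
    def c(x: int) -> bool:
        return a(x // 2) if x % 2 == 0 else b(x // 2)
    return c


def left(c: Oracle) -> Oracle:
    """Recover A from A (+) B with one query per question."""
    return lambda n: c(2 * n)


def right(c: Oracle) -> Oracle:
    """Recover B from A (+) B with one query per question."""
    return lambda n: c(2 * n + 1)
```

Since `join`, `left`, and `right` make only finitely many queries to their arguments, each direction is an instance of relative computability, which is the content of the first half of the proof.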


In particular, we have the following basic but important observation. Recall the 
definition of REC from Definition 4.6.6. 


Corollary 5.5.4.


1. $\mathrm{REC} \models \mathsf{RCA}_0$.
2. More generally, given any set $X \in 2^\omega$, the $\omega$-model (with second order part)
$\mathcal{S} = \{Y \in 2^\omega : Y \leq_T X\}$ is a model of $\mathsf{RCA}_0$.


An $\omega$-model as in (2) is called a topped model, and is said to be topped by $X$. (Thus,
$\mathrm{REC}$ is a topped model topped by any computable $X$.) We caution that this is not the
same as an $\omega$-model $\mathcal{S}$ such that, for some $X$, we have $Y \leq_T X$ for all $Y \in \mathcal{S}$. We
will see many examples of models of the latter kind, but these will only be topped
provided $X$ itself belongs to $\mathcal{S}$ (which it usually will not).


5.5.2 Coding finite sets 


In contrast to $\omega$-models, nonstandard models of second order arithmetic, like nonstandard
models of first order arithmetic, exhibit unusual behaviors that one must be
cautious about. Perhaps none is more striking than the distinction between “bounded”
and “finite”. For this reason, we need a more precise definition of finite set.

We previously described the pairing function $\langle n, m \rangle$, which will be a key tool. The
method we will employ for coding finite sequences dates back to Gödel [125, 316],
who showed that a version of the Chinese remainder theorem could be formalized
in weak systems of arithmetic. To avoid repeating proofs that are present in numerous
places in the literature, we will continue to refer to the development given by
Simpson [288]. Another development of finite sets and sequences in weak systems
of arithmetic is given by Hájek and Pudlák [134, Chapter I].


Definition 5.5.5. A number $c$ represents a set $X$ if there are $k$, $m$, and $n$ such that
$c = \langle k, \langle m, n \rangle \rangle$ and, for all $i$, we have $i \in X$ if and only if


$$(i < k) \land (m \cdot (i + 1) + 1 \text{ divides } n). \tag{5.3}$$


The code for a set $X$ is the least $c$ that represents $X$, if such a number exists. A set $X$
is coded if it has a code.


The key feature of the definition is that the property in (5.3) can be expressed
with a formula of arithmetic that has only bounded quantifiers. Similarly, there is a
primitive recursive function y so that if a set $X$ is coded by a number $c$ then, for all
$i$, we have $i \in X$ if and only if $y(i, c) = 1$. This allows us to handle coded sets within
weak systems of arithmetic, which is the motivation for the specific coding method
in (5.3).
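
To see how membership is read off from a representing number, here is a small Python sketch for the standard model. It is an illustration only: the helper names `pair`, `unpair`, and `member` are ours, the pairing is the term $\langle n, m \rangle = (n + m)^2 + m$ fixed above, and numbers that are not of the form $\langle k, \langle m, n \rangle \rangle$ are treated as representing the empty set, which is a convention of the sketch rather than of the text.

```python
from math import isqrt


def pair(n: int, m: int) -> int:
    """The pairing term <n, m> = (n + m)^2 + m."""
    return (n + m) ** 2 + m


def unpair(c: int):
    """Return (n, m) with pair(n, m) == c, or None if c codes no pair."""
    s = isqrt(c)
    m = c - s * s
    return None if m > s else (s - m, m)


def member(i: int, c: int) -> bool:
    """Decide condition (5.3) for i, reading k, m, n off c = <k, <m, n>>."""
    outer = unpair(c)
    if outer is None:
        return False
    k, mn = outer
    inner = unpair(mn)
    if inner is None:
        return False
    m, n = inner
    # (5.3): i < k and m * (i + 1) + 1 divides n.
    return i < k and n % (m * (i + 1) + 1) == 0
```

For example, with $k = 3$, $m = 6$, and $n = 7 \cdot 19 = 133$, the number $\langle 3, \langle 6, 133 \rangle \rangle$ represents $\{0, 2\}$: the moduli for $i = 0, 1, 2$ are $7$, $13$, and $19$, and only $7$ and $19$ divide $133$. Every search in `member` is bounded by $c$, which reflects the bounded-quantifier character of (5.3).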

When we work with nonstandard models of arithmetic, there is an important 
tension between sets that are finite, when viewed from outside the model, and sets 
that can be coded using numbers in the model. 


Definition 5.5.6. Suppose $\mathcal{M}$ is a model of $\mathsf{RCA}_0$ and $X \subseteq M$. (We do not assume
$X \in S^{\mathcal{M}}$.)


1. $X$ is $\mathcal{M}$-bounded (or simply bounded) if there is an $n \in M$ such that $i < n$ for all
$i \in X$.

2. $X$ is $\mathcal{M}$-coded (or simply coded) if there is a $c \in M$ that codes $X$ in the sense
of Definition 5.5.5.


In the standard model, the two notions coincide. The following proposition is 
essentially the Chinese remainder theorem. 


Theorem 5.5.7. Every bounded subset of $\omega$ has a code, and every set that has a code
is bounded.


In arbitrary models, the relationship is more complicated. In a nonstandard model
$\mathcal{M}$ of $\mathsf{RCA}_0$, there will always be subsets of $M$ that are bounded but not coded in the
model, such as the set of standard natural numbers. We will discuss this phenomenon
in detail in Section 6.2. The next result, which amounts to a formalization of the
preceding one, shows that this cannot happen for sets in $S^{\mathcal{M}}$.


Theorem 5.5.8. $\mathsf{RCA}_0$ proves that, if $X$ is a bounded set, then $X$ has a code. That is,
if $\mathcal{M} \models \mathsf{RCA}_0$, $X \in S^{\mathcal{M}}$ and $X$ is $\mathcal{M}$-bounded, then $X$ is $\mathcal{M}$-coded.


Proof. Arguing in $\mathsf{RCA}_0$, say $X$ is bounded by $k \in \mathbb{N}$. By primitive recursion, define
$m = k!$. More precisely, define $f \colon k + 1 \to \mathbb{N}$ as follows: let $f(0) = 1$, and for $i < k$
let $f(i + 1) = f(i) \cdot (i + 1)$; then let $m = f(k)$. By induction,


$$i + 1 \text{ divides } m \text{ for all } i < k. \tag{5.4}$$


We claim that for all $j < i < k$, $m(j + 1) + 1$ and $m(i + 1) + 1$ are relatively prime.
Indeed, say $d$ divides both $m(j + 1) + 1$ and $m(i + 1) + 1$. Then $d$ also divides their
difference, which is just $m(i - j)$. Setting $m(j + 1) + 1 = d d_1$ and $m(i - j) = d q$,
we see that $m(j + 1) + 1$ divides $d d_1 q = m(i - j) d_1$. But $m$ and $m(j + 1) + 1$ are
relatively prime, so $m(j + 1) + 1$ must divide $(i - j) d_1$. (Check that $\mathsf{RCA}_0$ can prove
this standard arithmetical fact!) On the other hand, $1 \leq i - j < k$ by hypothesis, so
$i - j$ divides $m$ by (5.4). Thus, $m(j + 1) + 1$ and $i - j$ must be relatively prime, and
the former must consequently divide $d_1$. Since $d_1$ divides $m(j + 1) + 1$, it follows
that $d = 1$, which proves the claim.

Now by primitive recursion, define $n = \prod_{i \in X} (m(i + 1) + 1)$. More precisely,
define $g \colon k + 1 \to \mathbb{N}$ as follows. Let $g(0) = 1$. For $i < k$, let


$$g(i + 1) = \begin{cases} g(i) \cdot (m(i + 1) + 1) & \text{if } i \in X, \\ g(i) & \text{if } i \notin X. \end{cases}$$

And let $n = g(k)$.

Clearly, $m(i + 1) + 1$ divides $n$ if $i \in X$. An induction on $\ell \leq k$ shows that any
prime factor of $g(\ell)$ must be a factor of $m(i + 1) + 1$ for some $i < \ell$ in $X$. By the
claim above, it follows that $i \in X$ if $m(i + 1) + 1$ divides $n = g(k)$.

We conclude that $r = \langle k, \langle m, n \rangle \rangle$ represents $X$ in the sense of Definition 5.5.5. The
set $R$ of all such representatives exists by $\Sigma^0_0$ comprehension, and hence is nonempty.
Let $\psi(x)$ be the formula $(\forall y \leq x)[y \notin R]$. Then we have $\neg\psi(r)$, so by $\Sigma^0_0$ induction
there is a $c \leq r$ such that $\neg\psi(c)$ and either $c = 0$ or $c > 0$ and $\psi(c - 1)$. In any case,
$c$ is a code for $X$. □
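
The construction in the proof can be replayed in the standard model. The following Python sketch is only an illustration under stated assumptions: the function names are ours, the set $X$ is given as a finite Python set together with a bound $k$, and the sketch stops at the representative $r$ rather than searching for the least one, since that search is infeasible even for small inputs.

```python
from math import gcd


def pair(n: int, m: int) -> int:
    """The pairing term <n, m> = (n + m)^2 + m."""
    return (n + m) ** 2 + m


def representative(X: set, k: int):
    """Return (m, n, r) as in the proof, for a set X bounded by k."""
    if any(not (0 <= i < k) for i in X):
        raise ValueError("X must be a set of numbers below k")
    m = 1
    for i in range(k):        # f(i + 1) = f(i) * (i + 1), so m = f(k) = k!
        m *= i + 1
    n = 1
    for i in range(k):        # g(i + 1) multiplies in m(i + 1) + 1 when i is in X
        if i in X:
            n *= m * (i + 1) + 1
    return m, n, pair(k, pair(m, n))


def pairwise_coprime(m: int, k: int) -> bool:
    """Check the claim: m(j + 1) + 1 and m(i + 1) + 1 are coprime for j < i < k."""
    return all(gcd(m * (j + 1) + 1, m * (i + 1) + 1) == 1
               for i in range(k) for j in range(i))


def decodes_to(X: set, k: int, m: int, n: int) -> bool:
    """Check that condition (5.3) holds exactly for the elements of X below k."""
    return all((i in X) == (n % (m * (i + 1) + 1) == 0) for i in range(k))
```

For instance, `representative({1, 3, 4}, 6)` gives $m = 720$ and $n = 1441 \cdot 2881 \cdot 3601$, and both `pairwise_coprime(720, 6)` and `decodes_to({1, 3, 4}, 6, 720, n)` hold, matching the two halves of the argument.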


The statement of the previous theorem illustrates an important convention in the
literature. When we consider an arbitrary model $\mathcal{M}$, there are two possible meanings
for the word “set”: arbitrary subsets of $M$, or elements of $S^{\mathcal{M}}$. If we say that $\mathsf{RCA}_0$
proves that all sets have a particular property $\Phi$, we mean that $\mathsf{RCA}_0$ proves a sentence
of the form $(\forall X)\Phi$, and thus proves the property for all sets in $S^{\mathcal{M}}$. In contrast, the
following theorem would often be stated in the literature as “$\mathsf{RCA}_0$ proves that every
coded set exists.”


Proposition 5.5.9. If $\mathcal{M}$ is a model of $\mathsf{RCA}_0$ and $X \subseteq M$ is $\mathcal{M}$-coded, then $X \in S^{\mathcal{M}}$.


Proof. Suppose that $c$ is a code for $X$. The property in (5.3) can be expressed with
only bounded quantifiers, and therefore $\Delta^0_1$ comprehension suffices to construct the
set $X$. □


Combining the previous results gives a characterization of a particular class of
sets in a model of $\mathsf{RCA}_0$.


Definition 5.5.10. Suppose $\mathcal{M}$ satisfies $\mathsf{RCA}_0$. A set $X \subseteq M$ is $\mathcal{M}$-finite if it satisfies
the following equivalent conditions:


1. $X$ is $\mathcal{M}$-coded.
2. $X \in S^{\mathcal{M}}$ and $X$ is $\mathcal{M}$-bounded.


When the model is clear from context, or if we are working within $\mathsf{RCA}_0$, we
may refer to these sets as simply finite. However, a set may be $\mathcal{M}$-finite but not finite
in the standard sense. For example, if $\mathcal{M}$ is a nonstandard model and $n \in M$ is a
nonstandard number, the set $L_n = \{i : i < n\}$ will be an $\mathcal{M}$-bounded set in $S^{\mathcal{M}}$,
and will thus be $\mathcal{M}$-coded with some nonstandard $c \in M$. But $L_n$ will contain every
standard natural number, and will thus be infinite from an external viewpoint. This
distinction between the sets that are finite from the perspective of a model, compared
to sets that are finite from an external viewpoint, must be kept in mind when working
with (possibly) nonstandard models, especially with regard to induction. We discuss
this issue in more detail in Section 6.2.

The comprehension and induction axioms in $\mathsf{RCA}_0$ give us powerful tools to
handle $\mathcal{M}$-finite sets. One example of this is the following theorem. We have seen
that $\mathsf{RCA}_0$ does not include the $\Sigma^0_1$ comprehension scheme, which is equivalent to $\mathsf{ACA}_0$.
But $\mathsf{RCA}_0$ does include a version of $\Sigma^0_1$ comprehension applied to finite sets.


Theorem 5.5.11 (Bounded $\Sigma^0_1$ comprehension). $\mathsf{RCA}_0$ proves the following. Suppose
$\varphi(n)$ is a $\Sigma^0_1$ formula (which may have parameters), and $X$ is a finite set. Then
the set $\{n \in X : \varphi(n)\}$ exists. In particular, for all $z$ the set $\{n < z : \varphi(n)\}$ exists.


We delay the proof to the next chapter, where Theorem 6.2.6 will provide a generalization.
Just as bounded $\Sigma^0_1$ comprehension is linked to $\Sigma^0_1$ induction, we will see that
bounded comprehension for formulas higher in the arithmetical hierarchy is linked
to stronger induction principles.


5.5.3 Formalizing computability theory 


In Theorem 5.2.3, we saw that every computable function is $\Sigma^0_1$ definable. The proof
relied on the fact that each primitive recursive function is $\Delta^0_1$ definable, and the way
that the universal computable function $\Phi$ from Definition 2.4.2 is defined using the
primitive recursive functions $U$ and $T$.

We can leverage these facts and their proofs to formalize computability theory
within $\mathsf{RCA}_0$. As long as the natural $\Sigma^0_1$ and $\Pi^0_1$ formulas are used to represent a
primitive recursive function in Lemma 5.2.2, $\mathsf{RCA}_0$ will be able to prove that the
primitive recursive function satisfies the recursion equations used to construct that
function. For example, if $f(x) = h(g_0(x), g_1(x))$ is formed by composition then,
letting $\rho$ and $\psi$ be the formulas naturally used to prove Lemma 5.2.2, we will have


$$\mathsf{RCA}_0 \vdash (\forall z)(\forall x)\big([\rho(x, z) \leftrightarrow z = h(g_0(x), g_1(x))] \land [\rho(x, z) \leftrightarrow \psi(x, z)]\big).$$


The same holds for basic functions and for functions defined by primitive recursion:
$\mathsf{RCA}_0$ will be able to prove that the corresponding properties hold for the functions
defined by the formulas in Lemma 5.2.2. This is due to the way that the lemma is
proved by structural induction.

This allows for a formalization of computability theory in $\mathsf{RCA}_0$. Once we have
the natural representations of the primitive recursive functions $U$ and $T$, we have a
$\Sigma^0_1$ definition of $\Phi$ that allows us to verify many properties of computable functions
in $\mathsf{RCA}_0$, including the use principle, the key properties listed in Proposition 2.4.9,
the $S^m_n$ theorem, and the recursion theorem.
In this formalized computability theory, both the arguments to a computable
function and the index for a computable function may be nonstandard numbers, in
the context of a nonstandard model $\mathcal{M}$. For example, if $n$ is a nonstandard element
of $M$, the constant function $f(t) = n$ is computable (and primitive recursive) from
the perspective of $\mathcal{M}$, and hence has an index in $M$. That index is, necessarily,
nonstandard. Similarly, if $e$ is the usual index for the function $\Phi_e = \lambda t.\,2t$, then
$\mathcal{M}$ will satisfy $\Phi_e(n) = 2n$.

In the previous paragraph, we referred to the “usual” index, that is, the one that 
would naturally be obtained from the equation shown. But there are infinitely many 
other indices for this function, some of which may behave differently in nonstandard 
models. In the context of formalized computability theory, it becomes even more 
important to distinguish between a computable function f and a particular index e 
such that $f = \Phi_e$. There are examples of $e, e' \in \omega$ as follows.


• $\Phi_e$ is total in the standard model of $\mathsf{RCA}_0$, but not total in some nonstandard
models. For example, let $\Phi_e(n)$ converge if and only if $n$ does not code a proof
of $0 = 1$ from the axioms of $\mathsf{RCA}_0$. This computable function is total in the
standard model because there is no such coded proof, but is not total in a model
of $\neg\mathrm{Con}(\mathsf{RCA}_0)$.

• $\Phi_e$ and $\Phi_{e'}$ are the same function in the standard model, total in all models of
$\mathsf{RCA}_0$, but not equal in some nonstandard models. Examples again follow from
the incompleteness theorem.


One property that does hold is $\Sigma^0_1$ completeness: if the standard model of $\mathsf{RCA}_0$
satisfies $\Phi_e(n) = m$ then so does each nonstandard model of $\mathsf{RCA}_0$. So a computable
function on standard arguments cannot converge to different values in different
nonstandard models, if the function converges to a value in the standard model.

The study of formalized computability is part of reverse recursion theory. For
example, we can ask which induction axioms are needed to verify a particular
construction from classical recursion theory. We will present some results of this
sort in Section 6.4. In most contexts of reverse mathematics, however, we do not
need to refer directly to formalized computability. Instead, we view $\mathsf{RCA}_0$ itself as a
different kind of formalization of computable mathematics.


5.6 The subsystems $\mathsf{ACA}_0$ and $\mathsf{WKL}_0$


The system $\mathsf{RCA}_0$ provides a convenient base system, because it is strong enough
to formalize many of the routine coding methods from computable mathematics. At
the same time, many well known mathematical theorems are not provable in $\mathsf{RCA}_0$.
One goal of reverse mathematics is to identify the strength of these theorems. A
remarkable phenomenon is that many theorems are provable in $\mathsf{RCA}_0$ or equivalent
to one of four stronger subsystems over it. These subsystems, shown in Figure 5.2,
are themselves linearly ordered by provability.
Not all theorems are equivalent to one of the “big five” subsystems, of course.
For those principles, the “big five” serve as useful reference points. For example, a
theorem that is stronger than $\mathsf{RCA}_0$ but weaker than $\mathsf{WKL}_0$ is different from a theorem
that is weaker than $\mathsf{ACA}_0$ but neither implies nor is implied by $\mathsf{WKL}_0$.

In this section we will describe the systems $\mathsf{ACA}_0$ and $\mathsf{WKL}_0$, which along with
$\mathsf{RCA}_0$ make up the bottom half of the “big five”. These systems are closely tied to
Turing computability and subtrees of $2^{<\omega}$.


5.6.1 The subsystem $\mathsf{ACA}_0$


The initials ACA stand for “arithmetical comprehension axiom”. The system we
define now, $\mathsf{ACA}_0$, is a strengthening of $\mathsf{RCA}_0$ obtained by expanding the comprehension
scheme to include all arithmetical formulas. As we will see, this is closely
tied to the Turing jump operator.


Definition 5.6.1. $\mathsf{ACA}_0$ is the $\mathcal{L}_2$ theory that includes $\mathsf{RCA}_0$ and the comprehension
axiom scheme for all arithmetical formulas.


That is, $\mathsf{ACA}_0$ consists of $\mathsf{PA}^-$, comprehension for arithmetical formulas, and $\mathsf{I}\Sigma^0_1$. As
noted at the end of Section 5.3.2, we could replace $\mathsf{I}\Sigma^0_1$ by the set induction axiom
or, alternatively, by the induction scheme for all arithmetical formulas (not just $\Sigma^0_1$).
We will formally separate $\mathsf{ACA}_0$ from $\mathsf{RCA}_0$ in Proposition 5.6.14.

The next theorem gives a deeper result that comprehension for $\Sigma^0_1$ formulas is
enough to give comprehension for all arithmetical formulas.


Theorem 5.6.2. $\mathsf{ACA}_0$ is axiomatized by the theory $\Sigma^0_1\text{-}\mathsf{CA}_0$ consisting of $\mathsf{PA}^-$, the
comprehension axiom scheme for $\Sigma^0_1$ formulas, and the set induction axiom.


Proof. By the preceding discussion, it is enough to show that if $\varphi(n)$ is an arithmetical
formula then $\Sigma^0_1\text{-}\mathsf{CA}_0$ proves that the set $\{n : \varphi(n)\}$ exists. Without loss of
generality we may assume that $\varphi$ is in prenex normal form.
The $\Sigma^0_1$ comprehension scheme includes the comprehension axiom for taking the
complement of a set:
$$(\forall X)(\exists Y)(\forall n)[n \in X \leftrightarrow n \notin Y].$$


Thus, by taking complements, we may assume that $\varphi$ begins with a block of existential
quantifiers, and is of the form $\varphi(n) \equiv (\exists x)\psi(n, x)$, possibly with parameters.

We proceed by induction on the level of $\varphi$ in the arithmetical hierarchy. The base
case, for $\Sigma^0_1$ formulas, follows by assumption. So, by induction, assume that we have
comprehension for $\Sigma^0_k$ formulas, and assume $\varphi$ is $\Sigma^0_{k+1}$, so that $\psi$ is $\Pi^0_k$.

To proceed by induction, we make use of the pairing function to form the set
$A = \{\langle n, x\rangle : \neg\psi(n, x)\}$. This set is defined by the formula


$$(\exists n)(\exists x)[t = \langle n, x\rangle \land \neg\psi(n, x)]$$

in the free variable $t$, which is $\Sigma^0_k$. By induction, we may form the set $A$ using repeated applications
of $\Sigma^0_1$ comprehension. Then we may form our desired set $\{n : \varphi(n)\}$ using $\Sigma^0_1$
comprehension on the formula $(\exists x)[\langle n, x\rangle \notin A]$, which is $\Sigma^0_1$ with parameter $A$. □
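As a worked instance of the induction step, consider the first nontrivial level, $k = 1$. The display below is a sketch under the assumption that $\varphi$ is $\Sigma^0_2$ with a bounded-quantifier matrix; the symbol $\theta$ for that matrix is chosen here for illustration only. It shows how two applications of $\Sigma^0_1$ comprehension, with a complement taken in between, yield the set defined by $\varphi$.

```latex
\begin{align*}
\varphi(n) &\equiv (\exists x)(\forall y)\,\theta(n,x,y),
   \qquad \theta \in \Sigma^0_0,\\
A &= \{\langle n,x\rangle : (\exists y)\,\neg\theta(n,x,y)\}
   && \text{(exists by $\Sigma^0_1$ comprehension)},\\
\varphi(n) &\leftrightarrow (\exists x)\,[\langle n,x\rangle \notin A]
   && \text{($\Sigma^0_1$ with parameter $A$)},\\
\{n : \varphi(n)\} &= \{n : (\exists x)\,[\langle n,x\rangle \notin A]\}
   && \text{(exists by $\Sigma^0_1$ comprehension)}.
\end{align*}
```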


Using the formalization of computability theory in $\mathsf{RCA}_0$, as discussed in Section 5.5.3,
we get the following corollary to Theorem 5.6.2.


Corollary 5.6.3. $\mathsf{ACA}_0$ is equivalent to $\mathsf{RCA}_0 + (\forall X)[X' \text{ exists}]$.


So, in the parlance of Section 4.6, we see that the $\omega$-models of $\mathsf{ACA}_0$ are precisely
the jump ideals, i.e., the Turing ideals that are closed under the Turing jump. By
Post’s theorem (Theorem 2.6.2), we also have the following example.


Corollary 5.6.4. Let $\mathcal{M}$ be the $\omega$-model with a second order part consisting of
exactly the arithmetical sets. Then $\mathcal{M} \models \mathsf{ACA}_0$.


There is a subtle point of caution here. Obviously, another characterization of the
$\omega$-models of $\mathsf{ACA}_0$ is that they are those Turing ideals $\mathcal{S}$ such that for all $X \in \mathcal{S}$ and
all $n \in \omega$, $X^{(n)} \in \mathcal{S}$. However, $\mathsf{ACA}_0$ is not equivalent to $\mathsf{RCA}_0 + (\forall X)(\forall n)[X^{(n)} \text{ exists}]$.
Indeed, this characterizes another system.


Definition 5.6.5. $\mathsf{ACA}_0'$ is the $\mathcal{L}_2$ theory that includes $\mathsf{RCA}_0$ and the axiom
$(\forall X)(\forall n)[X^{(n)} \text{ exists}]$.


The issue is that in a model, the quantifier on $n$ here may potentially range over
nonstandard numbers. And indeed, it is not difficult to see that there is no way to get
around this obstacle.


Proposition 5.6.6. $\mathsf{ACA}_0$ does not prove $(\forall X)(\forall n)[X^{(n)} \text{ exists}]$. Therefore, $\mathsf{ACA}_0'$ is
a strictly stronger system than $\mathsf{ACA}_0$.


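Proof. Start with a nonstandard model $M$ of $\mathsf{PA}$, and let $\mathcal{S}$ consist of all $X \subseteq M$
that are arithmetically definable in $M$. Let $\mathcal{M}$ be the $\mathcal{L}_2$ structure with first order
part $M$ and $\mathcal{S}^{\mathcal{M}} = \mathcal{S}$. Then this is a model of $\mathsf{ACA}_0$ by Theorem 5.6.2.

Consider $\emptyset \in \mathcal{S}$ and let $n \in M$ be nonstandard. For a contradiction, assume there
is $B \in \mathcal{S}^{\mathcal{M}}$ so that $\mathcal{M} \models B = \emptyset^{(n)}$. Choose $k \in \omega$ such that $B$ is $\Sigma^0_k$ definable over
$M$. By a formalization of Post’s theorem, we have $\mathcal{M} \models B \leq_T \emptyset^{(k)}$. Because $n$ is
nonstandard, $k + 1 <^{M} n$. Hence, we have $\mathcal{M} \models \emptyset^{(k+1)} \leq_T \emptyset^{(n)} = B \leq_T \emptyset^{(k)}$. But
$\mathsf{ACA}_0$ proves $\emptyset^{(k+1)} \nleq_T \emptyset^{(k)}$, so we have a contradiction. □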


By a clever compactness argument, even more can be said for $\Pi^1_2$ statements
(which in particular include many sentences of $\mathcal{L}_2$ corresponding to $\forall\exists$ theorems of
interest in reverse mathematics). We have the following somewhat surprising result.


Theorem 5.6.7 (see Wang [320]). Suppose that $\varphi$ is an $\mathcal{L}_2$-sentence of the form
$(\forall X)[\theta(X) \to (\exists Y)\psi(X, Y)]$, where $\theta$ and $\psi$ are arithmetical. If $\mathsf{ACA}_0 \vdash \varphi$ then
there is a $k \in \omega$ such that $\mathsf{ACA}_0 \vdash (\forall X)[\theta(X) \to (\exists Y \leq_T X^{(k)})\psi(X, Y)]$.

So a problem (whose formalization into $\mathcal{L}_2$ is) provable in $\mathsf{ACA}_0$ must have the feature
that not only does it admit arithmetical solutions, but for some (fixed!) $k \in \omega$, it
admits $\Sigma^0_k$ solutions. An important example of a problem that enjoys the former
property but not the latter is Ramsey’s theorem (see Chapter 8).

The provenance of Theorem 5.6.7 is a bit hard to pin down. It seems to first 
show up in print in a 1981 book by Wang [320], who suggests it is folklore. (It 
appears to also be treated as such by Friedman in [110].) Wang gives two proofs, 
both unpublished and obtained by personal communication; one is due to Jockusch, 
and another, apparently earlier one, is due to Solovay. We give the proof by Jockusch. 


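Proof (of Theorem 5.6.7; Jockusch, unpublished). Suppose $\mathsf{ACA}_0 \vdash \varphi$. Let $\mathcal{L}_2'$ be
the language obtained from $\mathcal{L}_2$ by the addition of a new constant symbol, $C$, of the
second sort. For each $k \in \omega$, let $\zeta_k \equiv \neg(\exists Y \leq_T C^{(k)})\psi(C, Y)$, and let $T$ be the
$\mathcal{L}_2'$ theory $\mathsf{ACA}_0 + \theta(C) + \{\zeta_k : k \in \omega\}$. We claim that $T$ is inconsistent. From this
it follows that there is a $k \in \omega$ such that $\mathsf{ACA}_0 \vdash \theta(C) \to (\exists Y \leq_T C^{(k)})\psi(C, Y)$.
Since $C$ is not in $\mathcal{L}_2$, this implies $\mathsf{ACA}_0 \vdash (\forall X)[\theta(X) \to (\exists Y \leq_T X^{(k)})\psi(X, Y)]$, as
desired.

Seeking a contradiction, suppose $T$ is consistent and fix $\mathcal{M} \models T$. Let $\mathcal{N}$ be the
model with first order part $M$ and second order part all subsets of $M$ arithmetically
definable in $\mathcal{M}$. By Theorem 5.6.2, $\mathcal{N} \models \mathsf{ACA}_0$, hence $\mathcal{N} \models \varphi$. Also, $\mathcal{N}$ is an $\omega$-submodel
of $\mathcal{M}$, so by Exercise 5.13.1, the two models satisfy the same arithmetical
sentences. Since $C \in \mathcal{S}^{\mathcal{N}}$, it follows in particular that $\mathcal{N} \models \theta(C)$. There must thus
exist some $D \in \mathcal{S}^{\mathcal{N}}$ such that $\psi(C, D)$ holds in $\mathcal{N}$, and as $\mathcal{S}^{\mathcal{N}} \subseteq \mathcal{S}^{\mathcal{M}}$ and $\psi$ is
arithmetical, this fact also holds in $\mathcal{M}$. But $D$ is arithmetically definable in $\mathcal{M}$,
hence there must be some (standard) $k \in \omega$ such that $\mathcal{M} \models D \leq_T C^{(k)}$. It follows
that $\mathcal{M} \models \neg\zeta_k$, a contradiction. □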


One more interesting strengthening of $\mathsf{ACA}_0$, which we will work with in Section 9.9,
is the following.


Definition 5.6.8. $\mathsf{ACA}_0^+$ is the $\mathcal{L}_2$ theory that includes $\mathsf{RCA}_0$ and the axiom
$(\forall X)(\exists Y)(\forall n)[Y^{[0]} = X \land Y^{[n+1]} = (Y^{[n]})']$.


The $\omega$-models of $\mathsf{ACA}_0^+$ are precisely those classes $\mathcal{S}$ which, for every set $A$ they
contain, contain also the $\omega$ jump of $A$, meaning $A^{(\omega)} = \langle A, A', A'', \ldots\rangle$.


Proposition 5.6.9. There is an $\omega$-model of $\mathsf{ACA}_0$ (and hence of $\mathsf{ACA}_0'$) which is not
closed under the map $A \mapsto A^{(\omega)}$. Hence, $\mathsf{ACA}_0^+$ is a strictly stronger system than
$\mathsf{ACA}_0'$.


Proof. Let $\mathcal{S}$ be the $\omega$-model consisting of all arithmetical subsets of $\omega$. Since $\emptyset^{(\omega)}$
is not arithmetical, we are done. □
5.6.2 The subsystem $\mathsf{WKL}_0$


We will leverage the formal definition of finite sets to work with finite sequences
within subsystems of arithmetic. This will allow us to formalize trees and state the
defining axiom for the subsystem $\mathsf{WKL}_0$.

We have been viewing finite sequences as (graphs of) functions on initial seg- 
ments of w. To avoid using set quantifiers in formulas that quantify over finite sets 
(and, indirectly, finite sequences), we will define the collection of codes for finite 
sequences. As these codes are simply numbers, we can then quantify over codes with 
number quantifiers. 


Definition 5.6.10 (Coding strings in $\mathsf{RCA}_0$). The following definition is made in
$\mathsf{RCA}_0$. A (code for a) finite sequence or string is a (code for a) pair $\langle \ell, c\rangle$, where $c$ is
a code for a finite set $X$ such that:


1. every element of $X$ is of the form $\langle i, n\rangle$ for some $i < \ell$ and $n \in \mathbb{N}$,
2. for every $i < \ell$ there is a unique $n \in \mathbb{N}$ such that $\langle i, n\rangle \in X$.


We call $\ell$ above the length of $\alpha$, denoted $|\alpha|$. If $\langle i, n\rangle \in X$ we write $\alpha(i) = n$.
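The coding of strings can be sketched concretely in Python. This is an illustration only: the Cantor pairing function and the binary coding of finite sets used below are assumptions (the text fixes its own pairing function and set coding elsewhere), and all function names are hypothetical. The sketch codes a string as a pair consisting of its length and the code of the finite set of pairs (position, value).

```python
from math import isqrt


def pair(a, b):
    """Cantor pairing function (an assumed choice of pairing)."""
    return (a + b) * (a + b + 1) // 2 + b


def unpair(z):
    """Inverse of the Cantor pairing function."""
    w = (isqrt(8 * z + 1) - 1) // 2
    b = z - w * (w + 1) // 2
    return w - b, b


def encode_string(values):
    """Code a string as pair(length, c), where c is the binary code of the
    finite set {pair(i, values[i]) : i < length}."""
    set_code = 0
    for i, n in enumerate(values):
        set_code |= 1 << pair(i, n)
    return pair(len(values), set_code)


def decode_string(code):
    """Recover the string by reading off the elements of the coded finite set."""
    length, set_code = unpair(code)
    values = [None] * length
    for element in range(set_code.bit_length()):
        if (set_code >> element) & 1:
            i, n = unpair(element)
            values[i] = n
    return values


print(decode_string(encode_string([3, 1, 4])))
```

Because a code is a single number, statements about all finite sequences can be written with number quantifiers, which is the purpose of the definition.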


Definition 5.6.11 (Coding functions in $\mathsf{RCA}_0$). The following definition is made
in $\mathsf{RCA}_0$. Fix sets $A, B \subseteq \mathbb{N}$. A set $X$ is a (code for a) function from $A$ to $B$ if
$X = \langle A, B, F\rangle$, where every element of $F$ is of the form $\langle n, m\rangle$ for some $n \in A$ and
$m \in B$, and for every $n \in A$ there is a unique $m$ with $\langle n, m\rangle \in F$.


Remark 5.6.12 (Notation for functions). It is common to deviate from our notational
conventions and use lowercase letters near the middle of the alphabet ($f$, $g$, $h$, etc.)
when discussing coded functions in $\mathsf{RCA}_0$. This more closely aligns with standard
usage, and as a result, makes the notation more evocative. We also write simply
$f : A \to B$ or $f \in B^A$ instead of $f = \langle A, B, F\rangle$, and $f(n) = m$ instead of $\langle n, m\rangle \in F$,
etc.


With these definitions and notational conventions, most other standard definitions
formalize directly. This is the case, for instance, for the prefix relation $\preceq$, which
enables us to define when a finite sequence is an initial segment of another, or of a
function $f : \mathbb{N} \to \mathbb{N}$, etc. In $\mathsf{RCA}_0$, we can easily form the set of all finite sequences,
as well as the set of all binary finite sequences, and following our discussion in
Section 5.4, we denote these by $\mathbb{N}^{<\mathbb{N}}$ and $2^{<\mathbb{N}}$, respectively. A subset of one of these
is a tree if it is closed downward under $\preceq$, as usual. An infinite path through a tree is
defined in the standard way.

Using our formalism, we can naturally write König’s lemma and weak König’s
lemma as $\Pi^1_2$ sentences in $\mathcal{L}_2$. We will refer to these using the abbreviations KL and
WKL, respectively, which we previously used in Chapter 3 to refer to the informal
$\forall\exists$ theorem versions of these principles and their corresponding problem forms
(Definitions 3.2.1 and 3.2.2).


Definition 5.6.13. $\mathsf{WKL}_0$ is the $\mathcal{L}_2$ theory that includes $\mathsf{RCA}_0$ and WKL.




It follows immediately from definitions that an $\omega$-model $\mathcal{M}$ satisfies $\mathsf{WKL}_0$ if and
only if $\mathcal{S}^{\mathcal{M}}$ satisfies the problem WKL discussed in Chapters 3 and 4. So by Proposition 4.6.3,
we see that $\mathcal{M} \models \mathsf{WKL}_0$ if and only if (the second order part of) it is a
Scott set (as defined in Section 4.6). Equivalently, by Theorem 4.6.16, $\mathcal{M} \models \mathsf{WKL}_0$
if and only if for every $X \in \mathcal{S}^{\mathcal{M}}$ there is a $Y \in \mathcal{S}^{\mathcal{M}}$ with $X \ll Y$. In particular, $\mathsf{WKL}_0$
has no topped $\omega$-models. These considerations show that $\mathsf{WKL}_0$ lies strictly between
$\mathsf{RCA}_0$ and $\mathsf{ACA}_0$ in strength.


Proposition 5.6.14.


1. $\mathsf{ACA}_0 \vdash \mathsf{WKL}_0$.
2. There is an $\omega$-model of $\mathsf{RCA}_0 + \neg\mathsf{WKL}_0$.
3. There is an $\omega$-model of $\mathsf{WKL}_0 + \neg\mathsf{ACA}_0$.


Proof. Part (1) follows by Proposition 2.8.5. For (2), let $\mathcal{M}$ be REC, which is not
a Scott set. For (3), apply the low basis theorem (Theorem 2.8.18, using the remark
after the proof of Theorem 2.8.25) to fix a low $D \gg \emptyset$. By Theorem 4.6.16 there is
an $\omega$-model of WKL consisting entirely of $D$-computable sets. In particular, every
set in this model is low. It follows that $\emptyset'$ is not in this model, so by Corollary 5.6.3,
$\mathsf{ACA}_0$ does not hold. □


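It is interesting to note that the $\omega$-models of $\mathsf{WKL}_0$ are structured differently from those
of $\mathsf{RCA}_0$ and $\mathsf{ACA}_0$. For one, each of the latter two has a minimum $\omega$-model, namely
REC and the collection of arithmetical sets, respectively. By contrast, $\mathsf{WKL}_0$ does
not have even minimal $\omega$-models. Informally, this suggests that there is no set $\Gamma$ of
formulas so that $\mathsf{WKL}_0$ is equivalent to the set existence scheme restricted to $\Gamma$.


Proposition 5.6.15 (Simpson [288]). Suppose $\mathcal{M}$ is an $\omega$-model of $\mathsf{WKL}_0$, with
$X \in \mathcal{S}^{\mathcal{M}}$. There is an $\omega$-model $\mathcal{M}^*$ of $\mathsf{WKL}_0$ such that $X \in \mathcal{S}^{\mathcal{M}^*}$ but $\mathcal{S}^{\mathcal{M}^*}$ is a proper
subset of $\mathcal{S}^{\mathcal{M}}$.


Proof. Since $X \in \mathcal{S}^{\mathcal{M}}$, it follows by Proposition 4.6.3 that $\mathcal{S}^{\mathcal{M}}$ contains some
$D \gg X$. By Proposition 2.8.26, there is a $D^*$ with $D \gg D^* \gg X$. Let $\mathcal{M}^*$ be
the $\omega$-model consisting of all $Y \in 2^{\omega}$ such that $D^* \gg Y$. By Propositions 2.8.26
and 4.6.3, this model satisfies $\mathsf{WKL}_0$. We have $X \in \mathcal{S}^{\mathcal{M}^*}$ and $\mathcal{S}^{\mathcal{M}^*} \subset \mathcal{S}^{\mathcal{M}}$ since
$D^* \in \mathcal{S}^{\mathcal{M}}$, but $D \notin \mathcal{S}^{\mathcal{M}^*}$. □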


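A related structural difference is that the $\omega$-models of $\mathsf{WKL}_0$ are not closed under
intersection. It is easy to see that this is the case for $\omega$-models of $\mathsf{RCA}_0$ and $\mathsf{ACA}_0$
(and indeed, their minimum $\omega$-models may be presented as the intersections of all
their $\omega$-models).


Proposition 5.6.16 (Simpson [288]).


1. There exist $\omega$-models $\mathcal{M}_0$ and $\mathcal{M}_1$ of $\mathsf{WKL}_0$ such that the $\omega$-model $\mathcal{M}$ with
$\mathcal{S}^{\mathcal{M}} = \mathcal{S}^{\mathcal{M}_0} \cap \mathcal{S}^{\mathcal{M}_1}$ does not satisfy $\mathsf{WKL}_0$.
2. For every $X \nleq_T \emptyset$ there is an $\omega$-model of $\mathsf{WKL}_0$ not containing $X$.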
Proof. To prove part (1), fix $A_0, A_1 \in 2^{\omega}$ forming a minimal pair: that is, $A_0, A_1 \nleq_T \emptyset$
and for all $Y$, if $Y \leq_T A_0$ and $Y \leq_T A_1$ then $Y \leq_T \emptyset$. Fix $i < 2$. By the cone
avoidance basis theorem (Theorem 2.8.23) relativized to $A_i$ there exists $D_i \gg A_i$
such that $A_{1-i} \nleq_T D_i$. Thus also $D_{1-i} \nleq_T D_i$. Let $\mathcal{M}_i$ be the $\omega$-model consisting of
all $Y \in 2^{\omega}$ such that $Y \leq_T A_i$ or $A_i \leq_T Y \ll D_i$. By Propositions 2.8.26 and 4.6.3,
$\mathcal{M}_i \models \mathsf{WKL}_0$. Now consider an arbitrary $Y \in \mathcal{S}^{\mathcal{M}_0} \cap \mathcal{S}^{\mathcal{M}_1}$. By choice of $D_0$, $D_1$
and the fact that $Y \leq_T D_0, D_1$, it cannot be that $A_0, A_1 \leq_T Y$. We also cannot have
$A_i \leq_T Y$ and $Y \leq_T A_{1-i}$ for any $i$, lest $A_i$ would be computable from $A_{1-i}$. Thus, it
must be that $Y \leq_T A_0, A_1$, and hence that $Y \leq_T \emptyset$. In other words, $\mathcal{M} = $ REC, which
is not a Scott set and hence not a model of $\mathsf{WKL}_0$.

To prove part (2), apply the cone avoidance basis theorem to find $D \gg \emptyset$ with
$X \nleq_T D$. Then, let $\mathcal{M}$ be the $\omega$-model consisting of all $Y \ll D$. This is a model
satisfying $\mathsf{WKL}_0$ and $X \notin \mathcal{S}^{\mathcal{M}}$. □


5.7 Equivalences between mathematical principles 


We are now ready to look at our first example of an equivalence between an arith- 
metical principle and a subsystem of second order arithmetic. Compared to the re- 
ducibility notions from Chapter 4, this gives a different way to compare the strengths 
of mathematical problems. 

In particular, we represent the problems as subsystems $S$ and $T$ of second order
arithmetic. We ask whether $\mathsf{RCA}_0 + S$ proves every axiom of $T$. If so, we see that $T$ is
weaker than $S$ “relative to” $\mathsf{RCA}_0$. If it is also true that $\mathsf{RCA}_0 + T$ proves every axiom
of $S$, then $S$ and $T$ are equivalent over $\mathsf{RCA}_0$, and otherwise $S$ is strictly stronger.
In Section 5.11, we compare this new reducibility to those from Chapter 4 in more
detail.


Definition 5.7.1 (Ranges in $\mathsf{RCA}_0$). The following definition is made in $\mathsf{RCA}_0$. Fix
sets $A, B \subseteq \mathbb{N}$ and $f : A \to B$. A set $Y$ is the range of a function $X$ if
$(\forall m)[m \in Y \leftrightarrow (\exists n)[\langle n, m\rangle \in X]]$.


Note that as defined in Definition 5.6.11, each function comes together with its 
domain and codomain, and as such they both always exist. By contrast, ranges need 
not always exist, as the next theorem shows. 


Theorem 5.7.2. The following statements are equivalent over $\mathsf{RCA}_0$.


1. $\mathsf{ACA}_0$.
2. The range of every function from $\mathbb{N}$ to $\mathbb{N}$ exists.


Proof. First, we assume $\mathsf{ACA}_0$ and prove that every function has a range. This follows
immediately from the fact that the range of a function is defined by an arithmetical
formula with the function as a parameter.

Second, we assume the axioms of $\mathsf{RCA}_0$ and also assume that every function has
a range. We wish to prove that each axiom of $\mathsf{ACA}_0$ must hold. By Theorem 5.6.2
it is enough to show that each $\Sigma^0_1$ comprehension axiom holds. So let $\varphi(n)$ be a $\Sigma^0_1$
formula. We may assume $\varphi$ is of the form $(\exists m)\psi(m, n)$ where $\psi$ is $\Sigma^0_0$. We may
also assume there is at least one $n_0$ such that $\varphi(n_0)$, because comprehension for $\varphi$ is
trivial otherwise.

We first form the set $X$ consisting of all pairs $\langle\langle m, n\rangle, k\rangle$ such that


• $\psi(m, n)$ holds and $k = n$, or
• $\psi(m, n)$ does not hold and $k = n_0$.


This set can be formed by $\Delta^0_1$ comprehension. It follows from the definition that $X$ is
a function, because each $z$ can be written in exactly one way in the form $\langle m, n\rangle$.
Now suppose that $Y$ is the range of $X$, and let $n \neq n_0$ be fixed. If $n$ is in the range
of $X$ then, by construction, there is some $m$ such that $\psi(m, n)$ holds, so $\varphi(n)$ holds.
Conversely, if $\varphi(n)$ holds then there is some $m$ such that $\psi(m, n)$ holds, which means
$X(\langle m, n\rangle) = n$, so $n \in Y$. The same argument shows $n_0 \in Y$, and $\varphi(n_0)$ holds by
the choice of $n_0$. Thus $Y$ is the set $\{n : \varphi(n)\}$, which is what we wanted to
form. □
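The function built in the reversal can be sketched in Python. This is only an illustration under stated assumptions: the Cantor pairing function, the particular decidable relation `psi`, and all names below are hypothetical choices, not taken from the text. The sketch shows how a function whose values are decided pointwise can have as its range the set defined by an existential formula.

```python
from math import isqrt


def pair(a, b):
    """Cantor pairing function (an assumed choice of pairing)."""
    return (a + b) * (a + b + 1) // 2 + b


def unpair(z):
    """Inverse of the Cantor pairing function."""
    w = (isqrt(8 * z + 1) - 1) // 2
    b = z - w * (w + 1) // 2
    return w - b, b


def make_range_function(psi, n0):
    """Return X with X(<m, n>) = n if psi(m, n) holds, and n0 otherwise.

    psi must be decidable, and n0 must satisfy (exists m) psi(m, n0).
    """
    def X(z):
        m, n = unpair(z)
        return n if psi(m, n) else n0
    return X


# Hypothetical example: phi(n) says "n is a perfect square",
# that is, (exists m)[m * m == n]. Here n0 = 0 works, since 0 * 0 == 0.
def psi(m, n):
    return m * m == n


X = make_range_function(psi, 0)
finite_part_of_range = {X(z) for z in range(100)}
print(sorted(finite_part_of_range))
```

Each value of `X` is computed by a bounded check, yet deciding whether a given number belongs to the full range requires an unbounded search over witnesses. That gap is what the range existence principle closes.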


The right way to think of this theorem is alongside Corollary 5.6.3. If we can prove
that the range of any function exists, we can in particular prove it for the function that
enumerates the halting set (relative to a given oracle), and conversely. In practice,
this shows up as follows. To prove that $\mathsf{RCA}_0 \vdash P \to \mathsf{ACA}_0$ for some $\Pi^1_2$ statement
$P$ (formalizing an $\forall\exists$ theorem), we begin by showing that $P$ codes the jump. Then,
formalizing the argument in $\mathsf{RCA}_0$ typically yields a proof that the range of every
function from $\mathbb{N}$ to $\mathbb{N}$ exists. This formalized argument is made easier by referring
to ranges rather than formalized Turing jumps.

The word “range” in connection with a function is also sometimes used to refer to 
the formula defining the range. For example, we may say “the range of f is disjoint 
from the set Z” as shorthand for 


$$(\forall m)(\forall n)[f(n) = m \to m \notin Z].$$


Here again we will rely on usage of the word “exists”, as discussed above, to make 
our meaning clear. 

The proof of Theorem 5.7.2 has several aspects that are common to reverse math- 
ematics. First, it shows an equivalence between a “mathematical” principle (that the 
range of a function exists) and a “logical” principle (the arithmetical comprehen- 
sion scheme). The second part of the proof, which shows that if every function has a 
range then the arithmetical comprehension scheme is valid, is the reversal. The proof 
demonstrates two typical features of a reversal to $\mathsf{ACA}_0$. First, rather than directly
establishing that every instance of arithmetical comprehension holds, the reversal
makes use of a characterization of $\mathsf{ACA}_0$ given by Theorem 5.6.2. Even with this
simplification, the proof must still show that an entire axiom scheme holds. The
scheme is enumerated in the metatheory, rather than the object theory. Thus we do
not attempt to prove in $\mathsf{RCA}_0$ that “if $\varphi$ is arithmetical then $\{n : \varphi(n)\}$ exists” under
the assumption that every function has a range. Instead we verify that, for each $\varphi$,
$\mathsf{RCA}_0$ is able to show that “$\{n : \varphi(n)\}$ exists” under the same assumption.

We will see that $\mathsf{WKL}_0$, too, admits a characterization in terms of ranges of
functions, in the form of the $\Sigma^0_1$ separation principle (Exercise 5.13.18).

König’s lemma gives another example of an equivalence between a mathematical
theorem and a subsystem of second order arithmetic.


Theorem 5.7.3. The following are equivalent over $\mathsf{RCA}_0$.


1. $\mathsf{ACA}_0$.
2. KL.
3. KL restricted to trees in which each node has at most two immediate successors.


Proof. We have essentially already proved this, in the guise of Propositions 3.6.10 
and 4.2.9. We thus refer to these proofs, and indicate merely how to formalize them 
in $\mathsf{RCA}_0$. This gives us a first taste of a common practice in reverse mathematics: do
the computability theoretic argument first, formalize second. A couple of technical
steps in the formalization require facts we have not established yet; we point these
out below with forward references to the theorems that fill these in.


(1) → (2): We argue in $\mathsf{ACA}_0$. Let $T \subseteq \mathbb{N}^{<\mathbb{N}}$ be an infinite, finitely branching tree.
Recall the definition of the function $p$ in the proof of Proposition 4.2.9. This function
is arithmetically definable, and so exists. However, we must prove that it actually is a
function, i.e., that it is total. (In the original proof, we relied on KL for this, but here
of course that is not available to us since that is what we are proving.)

Fix $\alpha \in T$. If there is an $i$ such that $\{\beta \in T : \beta \succeq \alpha^\frown i\}$ is infinite then there
is a least such $i$ and $p(\alpha) = i$. (The existence of this least $i$ requires justification.
See Proposition 6.1.5.) So suppose $\{\beta \in T : \beta \succeq \alpha^\frown i\}$ is finite for all $i$. Since $T$ is
finitely branching, we can form the finite set $F$ of all $i$ such that $\alpha^\frown i \in T$. For each
$i \in F$, we can also form $S_i = \{\beta \in T : \beta \succeq \alpha^\frown i\}$, which is a finite set by assumption.
Then $S = \bigcup_{i \in F} S_i$ is a finite union of finite sets, and hence is itself finite. (This
is nontrivial because we are dealing with “finite” sets in potentially nonstandard
models. It is generally false in $\mathsf{RCA}_0$, as discussed in detail in Section 6.5. A proof
is given in Proposition 6.5.4 which shows, in particular, that this conclusion goes
through in $\mathsf{ACA}_0$.) Hence, $\alpha$ has only finitely many successors in $T$, and therefore
$p(\alpha) = 0$. With $p$ total, the rest of the proof of Proposition 4.2.9 formalizes easily.

The implication (2) → (3) is obvious. To prove (3) → (1), we now argue in
$\mathsf{RCA}_0$ and formalize the proof of Proposition 3.6.10. By Theorem 5.7.2, it suffices
to prove that the range of every injective function $f : \mathbb{N} \to \mathbb{N}$ exists. So fix such an
$f$. Let $T$ consist of all $\alpha \in \mathbb{N}^{<\mathbb{N}}$ satisfying the following conditions for all $y < |\alpha|$.


1. If $\alpha(y) = 0$ then $f(x) \neq y$ for all $x < |\alpha|$.
2. If $\alpha(y) = x + 1$ then $f(x) = y$.


Then $T$ exists by $\Delta^0_1$ comprehension, and it is easily seen to be a tree. Moreover,
using the fact that $f$ is injective, it follows that each $\alpha \in T$ has at most two immediate
successors. Thus, by (3) we may fix an infinite path $g \in [T]$. Now it is easy to see
that $y \in \operatorname{range}(f)$ if and only if $g(y) > 0$, so the range exists. □
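The tree used in the reversal (3) → (1) can be explored with a short Python sketch. Everything specific here is an assumption for illustration: the injective function `f`, the search bound on successor values, and the function names are hypothetical. The sketch checks membership in the tree by the two conditions above, lists immediate successors within a bound, and confirms on this example that the path `g` marks exactly the elements of the range of `f`.

```python
def in_tree(alpha, f):
    """Check the two defining conditions of the tree for the string alpha."""
    length = len(alpha)
    for y, value in enumerate(alpha):
        if value == 0:
            # Condition 1: no x below the length of alpha is mapped to y.
            if any(f(x) == y for x in range(length)):
                return False
        elif f(value - 1) != y:
            # Condition 2: alpha(y) = x + 1 must witness f(x) = y.
            return False
    return True


def immediate_successors(alpha, f, value_bound):
    """Immediate successors of alpha in the tree, searching values below a bound."""
    return [alpha + [v] for v in range(value_bound) if in_tree(alpha + [v], f)]


# Hypothetical injective function: f(3) = 0 and f(x) = x + 10 otherwise.
def f(x):
    return 0 if x == 3 else x + 10


# For this particular f, the path g can be written down explicitly:
# g(y) = x + 1 if f(x) = y, and g(y) = 0 if y is not in the range of f.
def g(y):
    if y == 0:
        return 4
    if y >= 10 and y != 13:
        return (y - 10) + 1
    return 0


print(immediate_successors([], f, 20))
```

In this example the root has two immediate successors: `[4]` lies on the path, while `[0]` is a guess that 0 is not in the range, and it dies out at length 4 once the witness `f(3) = 0` comes into view.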
5.8 The subsystems $\Pi^1_1\text{-}\mathsf{CA}_0$ and $\mathsf{ATR}_0$


We now turn to the strongest two subsystems of the “big five”: $\Pi^1_1\text{-}\mathsf{CA}_0$ and $\mathsf{ATR}_0$.
While $\mathsf{ACA}_0$ and $\mathsf{WKL}_0$ are closely related to classical computability theory, and
subtrees of $2^{<\omega}$, the stronger subsystems are closely related to hyperarithmetical
computability and to subtrees of $\omega^{<\omega}$.


5.8.1 The subsystem $\Pi^1_1\text{-}\mathsf{CA}_0$


The subsystem $\Pi^1_1\text{-}\mathsf{CA}_0$ is simple to define via the ordinary comprehension scheme.


Definition 5.8.1. $\Pi^1_1\text{-}\mathsf{CA}_0$ is the subsystem consisting of $\mathsf{RCA}_0$ together with the
comprehension scheme for $\Pi^1_1$ formulas.


Following the usual pattern discussed in Section 5.3.2, $\Pi^1_1\text{-}\mathsf{CA}_0$ proves the induction
scheme for $\Pi^1_1$ formulas and for $\Sigma^1_1$ formulas as well. The starting point of our
discussion of $\Pi^1_1\text{-}\mathsf{CA}_0$ is the following normal form theorem for $\Sigma^1_1$ formulas.


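Theorem 5.8.2 (Kleene’s normal form theorem for $\Sigma^1_1$ formulas). For each $\Sigma^1_1$
formula $\varphi(X)$, there is a $\Sigma^0_0$ formula $\theta(x, y)$ such that
$$\mathsf{ACA}_0 \vdash (\forall X)[\varphi(X) \leftrightarrow (\exists f \in \mathbb{N}^{\mathbb{N}})(\forall k)[\theta(X \upharpoonright k, f \upharpoonright k)]].$$


Proof. For ease of notation, we treat $\varphi$ as having no free set variables, and show that
there is a $\Sigma^0_0$ formula $\theta(x)$ such that


$$\mathsf{ACA}_0 \vdash \varphi \leftrightarrow (\exists f \in \mathbb{N}^{\mathbb{N}})(\forall k)[\theta(f \upharpoonright k)].$$


The proof is the same in the more general case. First, we exhibit a $\Pi^0_1$ formula $\psi(f, j)$
such that
$$\mathsf{ACA}_0 \vdash \varphi \leftrightarrow (\exists f \in \mathbb{N}^{\mathbb{N}})(\forall j)\,\psi(f, j).$$


Then $\psi(f, j)$ is equivalent to $(\forall \ell)\,\theta^*(f \upharpoonright \ell, j)$ for some $\Sigma^0_0$ formula $\theta^*$ by formalizing
Proposition 2.8.7. Now let $\theta(\sigma)$ be the formula $(\exists i, j \leq |\sigma|)[|\sigma| = \langle i, j\rangle \land \theta^*(\sigma \upharpoonright i, j)]$.
Then $\theta$ is $\Sigma^0_0$ and, as desired,
$$\begin{aligned}
\mathsf{ACA}_0 \vdash \varphi &\leftrightarrow (\exists f \in \mathbb{N}^{\mathbb{N}})(\forall j)\,\psi(f, j)\\
&\leftrightarrow (\exists f \in \mathbb{N}^{\mathbb{N}})(\forall j)(\forall \ell)\,\theta^*(f \upharpoonright \ell, j)\\
&\leftrightarrow (\exists f \in \mathbb{N}^{\mathbb{N}})(\forall k)\,\theta(f \upharpoonright k).
\end{aligned}$$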
To define $\psi$, write $\varphi$ as


$$(\exists X)(\forall x_0)(\exists y_0)(\forall x_1)(\exists y_1)\cdots(\forall x_n)(\exists y_n)\,
\theta(X, x_0, y_0, x_1, y_1, \ldots, x_n, y_n) \tag{5.5}$$

where $\theta$ is quantifier free. We claim that in $\mathsf{ACA}_0$, this is equivalent to


$$(\exists X)(\exists g_0, \ldots, g_n \in \mathbb{N}^{\mathbb{N}})(\forall x_0, \ldots, x_n)\,
\theta(X, x_0, g_0(x_0), x_1, g_1(x_0, x_1), \ldots, x_n, g_n(x_0, \ldots, x_n)) \tag{5.6}$$


where $g_i(x_0, \ldots, x_i)$ is shorthand for $g_i(\langle x_0, \ldots, x_i\rangle)$. That (5.6) implies (5.5) is
clear. In the other direction, fix $X$ witnessing (5.5). For each $i \leq n$, define $g_i \in \mathbb{N}^{\mathbb{N}}$
inductively as follows: for all $x_0, \ldots, x_i$, $g_i(x_0, \ldots, x_i)$ is the least $y_i$ such that


$$(\forall x_{i+1})(\exists y_{i+1})\cdots(\forall x_n)(\exists y_n)\,
\theta(X, x_0, g_0(x_0), x_1, g_1(x_0, x_1), \ldots, x_{i-1}, g_{i-1}(x_0, \ldots, x_{i-1}), x_i, y_i, \ldots, x_n, y_n).$$


To be precise, the induction is external since $n$ is fixed (and standard), and arithmetical
comprehension is used to prove that each $g_i$ exists. Now by induction on $i$, we obtain
the matrix of (5.6).

Say the number of variables in $X$ in $\theta$ is $m$. Define $\psi(f, x)$ to be the formula


$$\begin{aligned}
f = \langle X_0, \ldots, X_{m-1}, g_0, \ldots, g_n\rangle \land x = \langle x_0, \ldots, x_n\rangle
&\to (\forall y_0, \ldots, y_n)\big[(\forall i \leq n)[\langle\langle x_0, \ldots, x_i\rangle, y_i\rangle \in g_i]\\
&\qquad \to \theta(X_0, \ldots, X_{m-1}, x_0, y_0, x_1, y_1, \ldots, x_n, y_n)\big].
\end{aligned}$$


Note that $\psi$ is $\Pi^0_1$. Here, we are thinking of $X_i(n)$ and $g_i(n)$ as $f(i, n)$, so if $f \in \mathbb{N}^{\mathbb{N}}$
then so are all the $X_i$ and $g_i$. Then it is not difficult to see that (5.6) is equivalent to
$(\exists f \in \mathbb{N}^{\mathbb{N}})(\forall x)\,\psi(f, x)$, as wanted. □


Kleene’s normal form theorem exposes an important connection between $\Sigma^1_1$
formulas and well founded relations, as we will see in Corollary 5.8.4. In $\mathcal{L}_2$, we
formally think of a partial ordering as a (code for a) pair of sets $(X, \leq_X)$, such that
$X \subseteq \mathbb{N}$ and $\leq_X$ is a set of (codes for) pairs of elements of $X$ satisfying the axioms
of a partial order. As usual, for $n, m \in X$ we write $n <_X m$ if $n \leq_X m$ and $n \neq m$.
We also typically abbreviate $(X, \leq_X)$ simply by $X$. It will be convenient to isolate
the following definitions.


Definition 5.8.3.


1. If $X$ is a partial ordering, an infinite descending sequence through $X$ is a function
$f : \mathbb{N} \to X$ such that $f(k + 1) <_X f(k)$ for all $k \in \mathbb{N}$.

2. WF($X$) is the formula asserting that $X$ is a partial ordering and there exists no
infinite descending sequence through $X$.

3. WO($X$) is the formula asserting that $X$ is a linear ordering and there exists no
infinite descending sequence through $X$.


As usual, if WF($X$) holds we say $X$ is well founded. If $T \subseteq \mathbb{N}^{<\mathbb{N}}$ is a tree, we write WF($T$)
to mean WF($(T, \succeq)$), i.e., $T$ has no infinite path.

We now have the following consequence of Theorem 5.8.2. Essentially, it is a
restatement, but it is very convenient in connection with $\Pi^1_1\text{-}\mathsf{CA}_0$.
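Corollary 5.8.4. For each $\Sigma^1_1$ formula $\varphi(X)$ of $\mathcal{L}_2$, $\mathsf{ACA}_0$ proves that there is a tree
$T \subseteq \mathbb{N}^{<\mathbb{N}}$, uniformly $\Sigma^0_0$ definable from $\varphi$, as follows.


1. $(\forall X)[\varphi(X) \leftrightarrow (\exists f \in \mathbb{N}^{\mathbb{N}})(X \oplus f \in [T])]$.
2. $(\exists X)\varphi(X) \leftrightarrow \neg\mathrm{WF}(T)$.


Proof. Part (2) is an immediate consequence of part (1), so we prove the latter. Let $\psi$
be the $\Sigma^0_0$ formula given by Theorem 5.8.2, and define $T$ to be the set of all sequences
$\langle(\sigma \upharpoonright k, \alpha \upharpoonright k) : k < t\rangle$ where $\sigma \in 2^t$, $\alpha \in \mathbb{N}^t$, and $(\forall k < t)\,\psi(\sigma \upharpoonright k, \alpha \upharpoonright k)$. Then $T$
exists, and clearly it is a tree. We have


$$\begin{aligned}
\varphi(X) &\leftrightarrow (\exists f \in \mathbb{N}^{\mathbb{N}})(\forall t)\,\psi(X \upharpoonright t, f \upharpoonright t)\\
&\leftrightarrow (\exists f \in \mathbb{N}^{\mathbb{N}})(\forall t)(\forall k < t)\,\psi(X \upharpoonright k, f \upharpoonright k)\\
&\leftrightarrow (\exists f \in \mathbb{N}^{\mathbb{N}})(\forall t)[\langle(X \upharpoonright k, f \upharpoonright k) : k < t\rangle \in T]\\
&\leftrightarrow (\exists f \in \mathbb{N}^{\mathbb{N}})[X \oplus f \in [T]].
\end{aligned}$$


This completes the proof. □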


There is a point of subtlety in Definition 5.8.3 in the context of second order 
arithmetic because it is relatively easy for a model M to contain linear orderings that 
are not well founded when viewed externally, but which have no infinite descending 
sequences in M. For example, a key fact in hyperarithmetical theory is the existence
of computable pseudowellorderings, which are computable linear orderings of $\omega$
that are not well founded but have no hyperarithmetical (i.e., $\Delta^1_1$) infinite descending
sequences. If $S$ is a pseudowellordering, the jump ideal below $S$ is an $\omega$-model of
$\mathsf{ACA}_0$ that believes $S$ is a well ordering.


Definition 5.8.5. The hyperjump of a set $X$ is the set of numbers $e$ such that $\Phi_e^X$ is
the characteristic function of a (total) well ordering of $\omega$.


The hyperjump operation plays an analogous role in hyperarithmetical theory to
the Turing jump in classical computability. In particular, the hyperjump of a set $X$
is $\Pi^1_1$ complete, analogous to the fact that the Turing jump of $X$ is $\Sigma^0_1$ complete.
This fact can be formalized to obtain the following equivalence.


Theorem 5.8.6. $\Pi^1_1\text{-}\mathsf{CA}_0$ is equivalent over $\mathsf{RCA}_0$ to the principle that the hyperjump
of every set exists.


To prove the theorem, we recall the definition of the Kleene–Brouwer order on finite
sequences.


Definition 5.8.7. The Kleene–Brouwer ordering $\leq_{\mathrm{KB}}$ is a linear ordering of $\omega^{<\omega}$
defined as follows: for $\alpha, \beta \in \omega^{<\omega}$, we have $\beta \leq_{\mathrm{KB}} \alpha$ if one of the following holds.


1. $\beta \succeq \alpha$;
2. There is an $n$ such that $\beta \upharpoonright n = \alpha \upharpoonright n$ and $\beta(n) < \alpha(n)$.


Given a tree $T \subseteq \omega^{<\omega}$, $\mathrm{KB}(T)$ denotes the partial order $(T, \leq_{\mathrm{KB}})$.

It is easy to see that $\mathrm{KB}(T)$ is uniformly computable from $T$. The notion easily
formalizes in $\mathsf{RCA}_0$, and we then have the following important observation, whose
proof is left to Exercise 5.13.21.
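
To make the computability of the Kleene–Brouwer ordering concrete, here is a minimal Python sketch of the comparison in Definition 5.8.7. Finite sequences are represented as tuples of natural numbers; the function names `kb_leq` and `kb_sorted` are our own choices for illustration and do not come from the text.

```python
from functools import cmp_to_key


def kb_leq(beta, alpha):
    """Return True if beta <=_KB alpha, for finite sequences given as tuples."""
    # Clause 1: beta extends alpha (this includes the case beta == alpha).
    if len(beta) >= len(alpha) and beta[:len(alpha)] == alpha:
        return True
    # Clause 2: at the first position where the two sequences differ,
    # beta has the smaller entry.
    for b, a in zip(beta, alpha):
        if b != a:
            return b < a
    # Otherwise beta is a proper initial segment of alpha, so alpha <_KB beta.
    return False


def kb_sorted(nodes):
    """Sort a finite collection of sequences into increasing KB order."""
    def compare(x, y):
        if x == y:
            return 0
        return -1 if kb_leq(x, y) else 1
    return sorted(nodes, key=cmp_to_key(compare))


# A small finite tree: extensions come before the nodes they extend,
# and siblings are ordered left to right.
tree = [(), (0,), (1,), (0, 0), (0, 1)]
print(kb_sorted(tree))
```

Deciding `kb_leq` only inspects the two sequences themselves, which is the sense in which the ordering restricted to a tree is computable from the tree.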


Proposition 5.8.8. $\mathsf{ACA}_0$ proves that for all trees $T \subseteq \mathbb{N}^{<\mathbb{N}}$,
$\mathrm{WF}(T) \leftrightarrow \mathrm{WO}(\mathrm{KB}(T))$.


We turn to proving the theorem. 


Proof (of Theorem 5.8.6). Using the formalization of computability theory described
in Section 5.5.3, we can write down a $\Pi^1_1$ formula $\theta(X,e)$ stating that
$\Phi_e^X$ is the characteristic function of a well ordering of $\omega$.

First, assume $\Pi^1_1\text{-}\mathsf{CA}_0$ and fix any set $X$. Then $\{e \in \mathbb{N} : \theta(X,e)\}$ exists by $\Pi^1_1$
comprehension, and this is by definition the hyperjump of $X$.

In the other direction, we argue in $\mathsf{RCA}_0$. Assume that $\{e \in \mathbb{N} : \theta(X,e)\}$ exists
for every $X$. First, we deduce $\mathsf{ACA}_0$. To this end, fix an injective function $f \colon \mathbb{N} \to \mathbb{N}$.
We will show that the range of $f$ exists. By Theorem 5.7.2, this suffices. Using a
formalization of the $S^m_n$ theorem (Theorem 2.4.10) in $\mathsf{RCA}_0$, we can fix a function
$s \colon \mathbb{N} \to \mathbb{N}$ as follows: for all $y,n \in \mathbb{N}$, $\Phi^f_{s(y)}(\langle n,n \rangle) = 1$, and for all $m \neq n$,

$$\Phi^f_{s(y)}(\langle n,m \rangle) = \begin{cases} 1 - \chi_{\leq}(\langle n,m \rangle) & \text{if } \neg(\exists x \leq \max\{n,m\})[f(x) = y], \\ \chi_{\leq}(\langle n,m \rangle) & \text{otherwise.} \end{cases}$$

Here, $\chi_{\leq}$ refers to the characteristic function of $\leq$ (as a set of ordered pairs). Thus,
$\Phi^f_{s(y)}$ orders $n$ and $m$ oppositely to their natural ordering if no number $x \leq \max\{n,m\}$
witnesses that $y$ is in the range of $f$, and otherwise it orders them the
same as their natural ordering. This means that if $y$ is not in the range of $f$ then $\Phi^f_{s(y)}$
is the characteristic function of $\omega^* = (\omega, \geq)$, which is of course not a well ordering.
But if $y$ is in the range, say with $f(x) = y$, then $\Phi^f_{s(y)}$ is the linear ordering $\leq_X$ with

$$x-1 <_X x-2 <_X \cdots <_X 1 <_X 0 <_X x <_X x+1 <_X \cdots,$$

which is just isomorphic to $(\omega, \leq)$. So $y \in \operatorname{range}(f) \leftrightarrow s(y) \in \{e \in \mathbb{N} : \theta(f,e)\}$,
and thus $\operatorname{range}(f)$ exists.

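We can now establish $\Pi^1_1$ comprehension. Let $\varphi(y)$ be any $\Pi^1_1$ formula of $\mathcal{L}_2$.
Since $\mathsf{ACA}_0$ holds, we may appeal to Corollary 5.8.4 (and its uniformity) to find a
family $T = \langle T_y : y \in \mathbb{N} \rangle$ of trees $T_y \subseteq \mathbb{N}^{<\mathbb{N}}$ such that $\varphi(y) \leftrightarrow \mathrm{WF}(T_y)$. Using again
the $S^m_n$ theorem, let $s \colon \mathbb{N} \to \mathbb{N}$ be such that $\Phi^T_{s(y)}$ is the characteristic function of
$\mathrm{KB}(T_y)$ for every $y$. Then by Proposition 5.8.8, we have

$$\varphi(y) \leftrightarrow \mathrm{WF}(T_y) \leftrightarrow \mathrm{WO}(\mathrm{KB}(T_y)) \leftrightarrow s(y) \in \{e \in \mathbb{N} : \theta(T,e)\}.$$

Hence, $\{y : \varphi(y)\}$ exists by $\Delta^0_1$ comprehension. □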
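
The ordering used in the first half of the proof above can be explored on finite initial segments. The following Python sketch is only an illustration: it assumes that the bounded search for a witness ranges over all $x \leq \max\{n,m\}$, it replaces the index $s(y)$ by an ordinary closure, and the names `reversal_order` and `sort_prefix` are hypothetical.

```python
from functools import cmp_to_key


def reversal_order(f, y):
    """Return chi(n, m) in {0, 1}, the relation described for Phi^f_{s(y)}.

    Assumption: the witness search is over x <= max(n, m).
    """
    def chi(n, m):
        if n == m:
            return 1
        natural = 1 if n <= m else 0
        witnessed = any(f(x) == y for x in range(max(n, m) + 1))
        # Agree with the natural order once a witness is visible, else reverse it.
        return natural if witnessed else 1 - natural
    return chi


def sort_prefix(chi, bound):
    """List 0, ..., bound - 1 in increasing order with respect to chi."""
    def compare(n, m):
        if n == m:
            return 0
        return -1 if chi(n, m) == 1 else 1
    return sorted(range(bound), key=cmp_to_key(compare))


def f(x):
    # A sample injective function; its range is the set of odd numbers.
    return 2 * x + 1


# 7 = f(3) is in the range: only the numbers below 3 are reversed.
print(sort_prefix(reversal_order(f, 7), 6))
# 4 is not in the range: the whole prefix is reversed, as in omega*.
print(sort_prefix(reversal_order(f, 4), 6))
```

On a finite prefix one can of course never confirm that a value is missing from the range; the point is only to show how a single witness turns the reversed ordering into one with a well ordered tail.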


It follows that the $\omega$-models of $\Pi^1_1\text{-}\mathsf{CA}_0$ are exactly the Turing ideals that are
closed under hyperjump.

The next characterization of $\Pi^1_1\text{-}\mathsf{CA}_0$ is useful in practice because it avoids referring
directly to well orderings, while also giving a sentence equivalent to $\Pi^1_1\text{-}\mathsf{CA}_0$
over $\mathsf{RCA}_0$.


Theorem 5.8.9. $\Pi^1_1\text{-}\mathsf{CA}_0$ is equivalent over $\mathsf{RCA}_0$ to the principle that, given a
sequence $\langle T_i : i \in \mathbb{N} \rangle$ of subtrees of $\mathbb{N}^{<\mathbb{N}}$, there is a set $X$ so that, for all $i$, $i \in X$ if and
only if $T_i$ has an infinite path.


We leave the proof to Exercise 5.13.22. The key to the reversal is again first to deduce
$\mathsf{ACA}_0$, which is a common and helpful step, and then to use Kleene's normal form
theorem.

We will locate $\Pi^1_1\text{-}\mathsf{CA}_0$ alongside $\mathsf{RCA}_0$, $\mathsf{WKL}_0$, and $\mathsf{ACA}_0$ in the next section,
after we have introduced the last major subsystem, $\mathsf{ATR}_0$.


5.8.2 The subsystem $\mathsf{ATR}_0$


We now turn to the subsystem $\mathsf{ATR}_0$. Like $\mathsf{WKL}_0$, this subsystem is not given by
a restriction of the comprehension scheme, but by a separate kind of set existence
principle. There are many more analogies between $\mathsf{WKL}_0$ and $\mathsf{ATR}_0$, in fact, although
the mathematical principles equivalent to the subsystems are quite different.

The formal definition of $\mathsf{ATR}_0$ refers to well orderings of (nonempty) sets of
natural numbers. In what follows, therefore, all well orderings are assumed to have
field a nonempty subset of $\omega$. For ease of notation, if $L$ is a well ordering we use $L$
also for its field, and we use $\leq_L$ for the ordering relation. (Thus, $L$ as a well ordering
refers to the pair $(L, \leq_L)$, coded as a subset of $\omega$.) We use $<_L$ for the strict version
of $\leq_L$, and we write $0_L$ for the $\leq_L$-least element of $L$.


Definition 5.8.10. Let a well ordering $L$ and an operator $I \colon 2^\omega \to 2^\omega$ be given.
For each $X \in 2^\omega$, we define $I^{(L)}(X)$ to be the unique $Y \in 2^\omega$ with the following
inductively defined properties.


1. $Y^{[0_L]} = X$,
2. If $x \in L$ is the $\leq_L$-successor of $y \in L$ then $Y^{[x]} = I(Y^{[y]})$,
3. If $x \in L$ is a $\leq_L$-limit then

$$Y^{[x]} = \{\langle y,n \rangle : y <_L x \wedge n \in Y^{[y]}\}.$$


As an example, let $J$ be the Turing jump operator: $J(X) = X'$. (This is exactly
the problem $\mathsf{TJ}$, when viewed as a function.) If we identify each $n \in \omega$ with the well
ordering $(n, \leq)$ then we obviously have $J^{(n)}(X) = X^{(n)}$, the $n$th jump of $X$. And if
we identify $\omega$ with $(\omega, \leq)$ then $J^{(\omega)}(X) = X^{(\omega)} = \bigoplus_{n \in \omega} X^{(n)}$. But now, of course,
we can also define $X^{(L)} = J^{(L)}(X)$ for other well orderings, for example of order
type $\omega + 1$ or $\omega + \omega$, etc.
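
For finite well orderings, the successor clause of the definition is just repeated application of the operator, which the following Python sketch makes explicit. It is only a toy: the Turing jump is not computable, so we use a simple computable stand-in operator on finite sets, and the names `columns_along_finite_order` and `stand_in` are hypothetical. Limit stages do not occur in a finite ordering and are not modeled.

```python
def columns_along_finite_order(op, X, length):
    """Columns indexed by 0, ..., length - 1 of the iteration of op along (length, <=).

    Column 0 is X, and column k + 1 is op applied to column k.
    """
    columns = []
    current = frozenset(X)
    for _ in range(length):
        columns.append(current)
        current = frozenset(op(current))
    return columns


def stand_in(S):
    # A computable stand-in for an operator on sets (NOT the Turing jump):
    # shift every element up by one and add 0.
    return {x + 1 for x in S} | {0}


for index, column in enumerate(columns_along_finite_order(stand_in, set(), 4)):
    print(index, sorted(column))
```

Each column is determined by its predecessor alone, which is why uniqueness of the iterate can be argued by induction along the ordering.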

By analogy with Post's theorem, the Turing jump is universal in the following
sense for iterations of arithmetical functionals.


Theorem 5.8.11. Suppose $X$ and $Y$ are sets. The following are equivalent.


1. $X$ is computable from $I^{(L)}(Y)$ for some $I \colon 2^\omega \to 2^\omega$ arithmetical in $Y$ and some
well ordering $L$ computable from $Y$.
2. $X$ is computable from $Y^{(L)}$ for some well ordering $L$ computable from $Y$.


The field of hyperarithmetical theory begins with the study of which sets can 
be computed from these iterated Turing jumps of a given set, much as classical 
computability theory begins with the study of which sets are Turing reducible to a 
given set. 


Definition 5.8.12. A set $X$ is hyperarithmetical in a set $Y$ if $X \leq_{\mathrm{T}} Y^{(L)}$ for some well
ordering $L$ computable from $Y$. If $X$ is hyperarithmetical in $Y$ we write $X \leq_{\mathrm{HYP}} Y$.


The relation $\leq_{\mathrm{HYP}}$ of hyperarithmetical reducibility is reflexive and transitive, so
it induces an equivalence relation $\equiv_{\mathrm{HYP}}$ and a degree structure known as the
hyperdegrees. There are many analogies between Turing reducibility and hyperarithmetical
reducibility. In particular, just as Turing reducibility can be characterized by $\Delta^0_1$
definability, hyperarithmetical reducibility can be characterized by $\Delta^1_1$ definability.


Theorem 5.8.13 (Kleene; see [264], Chapter II). A set $X$ is hyperarithmetical in
a set $Y$ if and only if $X$ is $\Delta^1_1$ definable relative to $Y$.


The following subsystem of $\mathsf{Z}_2$ is a natural weakening of $\Pi^1_1\text{-}\mathsf{CA}_0$ that can thus be
considered a formal analogue of hyperarithmetical theory, and serves as an intermediate
step in our definition of $\mathsf{ATR}_0$.


Definition 5.8.14. $\Delta^1_1\text{-}\mathsf{CA}_0$ is the subsystem consisting of $\mathsf{RCA}_0$ together with the
scheme of $\Delta^1_1$ comprehension: the universal closure of

$$(\forall n)[\varphi(n) \leftrightarrow \neg\psi(n)] \to (\exists X)(\forall n)[n \in X \leftrightarrow \varphi(n)],$$

where $\varphi$ and $\psi$ range over $\Sigma^1_1$ formulas.


The collection of all hyperarithmetical sets forms an $\omega$-model $\mathrm{HYP}$ that, by
Theorem 5.8.13, is the $\subseteq$-minimum such model satisfying $\Delta^1_1\text{-}\mathsf{CA}_0$. Obviously, every
arithmetical set is hyperarithmetical. But $\Delta^1_1\text{-}\mathsf{CA}_0$ is strictly stronger than $\mathsf{ACA}_0$, and
in fact strictly stronger than $\mathsf{ACA}_0^+$.


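Proposition 5.8.15.
1. $\Delta^1_1\text{-}\mathsf{CA}_0 \vdash \mathsf{ACA}_0^+$.
2. There is an $\omega$-model of $\mathsf{ACA}_0^+ + \neg\Delta^1_1\text{-}\mathsf{CA}_0$.


Proof. For (1), it is clear that $\Delta^1_1\text{-}\mathsf{CA}_0 \vdash \mathsf{ACA}_0$ since every arithmetical formula is
trivially $\Delta^1_1$. Now fix $X$. We need to show that there is a $Y$ such that
$(\forall n)[Y^{[0]} = X \wedge Y^{[n+1]} = (Y^{[n]})']$. Let $\varphi(\langle i,n \rangle)$ and $\psi(\langle i,n \rangle)$ be the formulas

$$(\exists Z)[Z^{[0]} = X \wedge (\forall j < i)[Z^{[j+1]} = (Z^{[j]})'] \wedge n \in Z^{[i]}]$$

and

$$(\forall Z)[(Z^{[0]} = X \wedge (\forall j < i)[Z^{[j+1]} = (Z^{[j]})']) \to n \in Z^{[i]}],$$

respectively. By arithmetical induction on $i$, we have

$$(\forall n)[\varphi(\langle i,n \rangle) \leftrightarrow \psi(\langle i,n \rangle)].$$

Hence, by $\Delta^1_1\text{-}\mathsf{CA}_0$, the set $Y = \{\langle i,n \rangle : \varphi(\langle i,n \rangle)\}$ exists, and clearly this is the
desired set.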

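For (2), let $\mathcal{M}$ be the $\omega$-model consisting of all $Y \leq_{\mathrm{T}} \emptyset^{(\omega \cdot n)}$ for some $n \in \omega$.
Given any $X \in \mathcal{S}^{\mathcal{M}}$, say computable from $\emptyset^{(\omega \cdot n)}$, we have that

$$X^{(\omega)} \leq_{\mathrm{T}} \emptyset^{(\omega \cdot (n+1))}.$$

Hence, $\mathcal{M} \models \mathsf{ACA}_0^+$. Now, let $L$ be any computable well ordering of order type $\omega^2$.
Then $L$ belongs to $\mathcal{M}$ but $\emptyset^{(L)}$ does not because $\emptyset^{(L)} >_{\mathrm{T}} \emptyset^{(\omega \cdot n)}$ for all $n$. But $\emptyset^{(L)}$
is hyperarithmetical, so $\mathcal{M} \not\models \Delta^1_1\text{-}\mathsf{CA}_0$. □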


The subsystem $\mathsf{ATR}_0$ is also inspired by hyperarithmetical theory, but features
closure properties that $\Delta^1_1\text{-}\mathsf{CA}_0$ lacks. This is because of the existence of
hyperarithmetical pseudowellorderings, discussed in the preceding section. If $L$ is such an
ordering then $\mathrm{HYP} \models \mathrm{WO}(L)$ but $\mathrm{HYP} \not\models (\exists Y)[Y = \emptyset^{(L)}]$. $\mathsf{ATR}_0$ is basically
designed to overcome this problem: informally, it states that we can iterate an arbitrary
arithmetical functional along an arbitrary well ordering. In applications, this turns out to
be more flexible and more useful than being closed under $\Delta^1_1$ comprehension alone,
and accounts for $\mathsf{ATR}_0$ being a more ubiquitous system. (Hence the reason $\mathsf{ATR}_0$ makes
the list of “big five” subsystems, while $\Delta^1_1\text{-}\mathsf{CA}_0$ does not.)

To begin, we can formalize Definition 5.8.10 in second order arithmetic for
arithmetical operators. Formally, consider an arithmetical formula $\psi(i, X)$ of $\mathcal{L}_2$
(which may include parameters). We think of this as the operator $I_\psi$ such that
$I_\psi(X) = \{i : \psi(i, X)\}$.

Definition 5.8.16. Let $\psi(i, X)$ be an arithmetical formula. The following definition
is made in $\mathsf{RCA}_0$. Let $X, Y, L$ be sets. If $\mathrm{WO}(L)$, then we say $Y = I_\psi^{(L)}(X)$ if the
following hold.


1. $Y^{[0_L]} = X$,
2. If $x \in L$ is the $\leq_L$-successor of $y \in L$ then $i \in Y^{[x]}$ if and only if $\psi(i, Y^{[y]})$,
3. If $x \in L$ is a $\leq_L$-limit then

$$Y^{[x]} = \{\langle y,n \rangle : y <_L x \wedge n \in Y^{[y]}\}.$$


Definition 5.8.17. The subsystem $\mathsf{ATR}_0$ consists of $\mathsf{RCA}_0$ and the axiom scheme of
arithmetical transfinite recursion, which states the following for each arithmetical
formula $\psi(i, X)$: for all sets $X$ and $L$, if $\mathrm{WO}(L)$ then there exists a $Y$ such that
$Y = I_\psi^{(L)}(X)$.

It is not difficult to formalize Theorem 5.8.11 to obtain the following alternate
characterization of $\mathsf{ATR}_0$, which is analogous to the jump characterization of $\mathsf{ACA}_0$
(Corollary 5.6.3).


Theorem 5.8.18. The following statements are equivalent over $\mathsf{RCA}_0$.


1. $\mathsf{ATR}_0$.
2. For all sets $X$ and $L$, if $\mathrm{WO}(L)$ then there is a set $Y$ such that $Y = X^{(L)}$.


Here, $X^{(L)}$ abbreviates $I_\psi^{(L)}(X)$ for the formula $\psi$ defining $X'$.


There is an important distinction between arithmetical transfinite recursion and
arithmetical transfinite induction. The latter is provable in $\mathsf{ACA}_0$ (Exercise 5.13.19).
In particular, $\mathsf{ACA}_0$ can prove that if there is a $Y$ such that $Y = I_\psi^{(L)}(X)$ then it is
unique. However, $\mathsf{ATR}_0$ is strictly stronger than $\mathsf{ACA}_0$, and in fact, than $\Delta^1_1\text{-}\mathsf{CA}_0$, as
we will see in Proposition 5.8.21.

The definition of $\mathsf{ATR}_0$ helps to explain its name, but it is not always convenient for
reversals. We now state an equivalence that is much easier to work with in practice.


Definition 5.8.19. Let $\Gamma$ be a class of formulas of $\mathcal{L}_2$ in one free variable. The
$\Gamma$ separation principle is the following scheme, for every pair $\varphi(x)$ and $\psi(x)$ in $\Gamma$:
if $(\forall x)[\varphi(x) \to \neg\psi(x)]$, there exists a set $Z$ (called a separating set) such that
$(\forall x)[(\varphi(x) \to x \in Z) \wedge (\psi(x) \to x \notin Z)]$.


Informally, this says that if $X$ and $Y$ are disjoint $\Gamma$ definable sets (which may not
exist), then there exists a set $Z$ such that $X \subseteq Z$ and $Y \cap Z = \emptyset$.

Observe that if $Z$ satisfies $(\forall x)[(\varphi(x) \to x \in Z) \wedge (\psi(x) \to x \notin Z)]$ then $Z^* = \omega \setminus Z$
satisfies $(\forall x)[(\psi(x) \to x \in Z^*) \wedge (\varphi(x) \to x \notin Z^*)]$. So, at least over $\mathsf{RCA}_0$, the order of
$\varphi$ and $\psi$ does not matter. Also, note that if $\Gamma$ is a collection of formulas and $\neg\Gamma$ is
the collection $\{\neg\varphi : \varphi \in \Gamma\}$ then the $\Gamma$ separation principle is equivalent over $\mathsf{RCA}_0$
to the $\neg\Gamma$ separation principle.

Our interest is in $\Gamma = \Sigma^1_1$. The Turing jump of a set $X$ is arithmetical in $X$, and so
is trivially $\Sigma^1_1$ definable from $X$. This means that the $\Sigma^1_1$ separation principle implies
$\mathsf{ACA}_0$ over $\mathsf{RCA}_0$. The next theorem is much less trivial. It provides a useful starting
point for reversals to $\mathsf{ATR}_0$.


Theorem 5.8.20 (Simpson [288]). The following are equivalent over $\mathsf{RCA}_0$.
1. $\mathsf{ATR}_0$.
2. The $\Sigma^1_1$ separation principle.


We will prove this theorem in Chapter 12. Meanwhile, Exercise 5.13.18 shows
that $\mathsf{WKL}_0$ is equivalent to the $\Sigma^0_1$ separation principle. This draws another analogy
between $\mathsf{WKL}_0$ and $\mathsf{ATR}_0$. For now, we have the following consequence.


Proposition 5.8.21.


1. $\mathsf{ATR}_0 \vdash \Delta^1_1\text{-}\mathsf{CA}_0$.
2. There is an $\omega$-model of $\Delta^1_1\text{-}\mathsf{CA}_0 + \neg\mathsf{ATR}_0$.


Proof. For (1), we argue in $\mathsf{ATR}_0$. Suppose $\varphi$ and $\psi$ are $\Sigma^1_1$ and that $(\forall x)[\varphi(x) \leftrightarrow \neg\psi(x)]$.
Then by Theorem 5.8.20 there is a separating set $Z$ for $\varphi$ and $\psi$, and clearly
$(\forall x)[x \in Z \leftrightarrow \varphi(x)]$. For (2), consider the model $\mathrm{HYP}$. As we remarked above, this
satisfies $\Delta^1_1\text{-}\mathsf{CA}_0$ but not $\mathsf{ATR}_0$. □


Let us now look at $\mathsf{ATR}_0$ alongside $\Pi^1_1\text{-}\mathsf{CA}_0$. That $\Pi^1_1\text{-}\mathsf{CA}_0$ implies $\mathsf{ATR}_0$ is an
easy consequence of the preceding theorem. The separation is analogous to
Proposition 5.6.14 (2), which saw $\mathsf{WKL}_0$ separated from $\mathsf{ACA}_0$, only using higher
computability theory analogues. In particular, instead of the low basis theorem for $\Pi^0_1$
classes, we will need the Gandy basis theorem for $\Sigma^1_1$ classes. This states the
following: if $\mathcal{C}$ is a nonempty $\Sigma^1_1$ definable subset of $2^\omega$ then there exists $X \in \mathcal{C}$
that is hyperarithmetically low, meaning the hyperjump of $X$ is hyperarithmetically
reducible to the hyperjump of $\emptyset$. (For a proof, see Sacks [264], Corollary III.1.5.)


Theorem 5.8.22.


1. $\Pi^1_1\text{-}\mathsf{CA}_0 \vdash \mathsf{ATR}_0$.
2. There is an $\omega$-model of $\mathsf{ATR}_0 + \neg\Pi^1_1\text{-}\mathsf{CA}_0$.


Proof. For (1), we show that $\Pi^1_1\text{-}\mathsf{CA}_0$ implies the $\Sigma^1_1$ separation principle, which
suffices by Theorem 5.8.20. $\Pi^1_1\text{-}\mathsf{CA}_0$ proves that every $\Sigma^1_1$ definable subset of $\mathbb{N}$
exists. Hence, it can serve as its own separating set (with respect to any other $\Sigma^1_1$
formula defining a disjoint set).

For (2), we build an $\omega$-model by dovetailing and iterating, as in Theorem 4.6.13.
Let $\theta_0(x, X), \theta_1(x, X), \ldots$ be an enumeration of all $\Pi^1_1$ formulas in the displayed free
variables and no other set parameters. Set $A_0 = \emptyset$, and assume that for some $s$ we
have defined $A_s$. Say $s = \langle e,t,i,j \rangle$. If $\Phi_e^{A_t}$ is not an element of $2^\omega$, let $A_{s+1} = A_s$.
Otherwise, let $\mathcal{C}$ be the class of all $X \in 2^\omega$ such that for all $x$, if $\theta_i(x, \Phi_e^{A_t})$ holds
then $x \in X$, and if $\theta_j(x, \Phi_e^{A_t})$ holds then $x \notin X$. Clearly, $\mathcal{C}$ is nonempty since it
contains $\{x : \theta_i(x, \Phi_e^{A_t})\}$. By the Gandy basis theorem relative to $A_s$, there exists
$Z \in \mathcal{C}$ which is hyperarithmetically low relative to $A_s$. Let $A_{s+1} = A_s \oplus Z$.

We end up with $A_0 \leq_{\mathrm{T}} A_1 \leq_{\mathrm{T}} \cdots$. Let $\mathcal{M}$ be the $\omega$-model consisting of all $Y \in 2^\omega$
such that $Y \leq_{\mathrm{T}} A_s$ for some $s$. By induction, each $A_{s+1}$ is hyperarithmetically low
over $A_s$ and hence hyperarithmetically low. Thus, the hyperjump of $\emptyset$ does not
belong to $\mathcal{S}^{\mathcal{M}}$ and therefore $\mathcal{M} \not\models \Pi^1_1\text{-}\mathsf{CA}_0$ by Theorem 5.8.6. On the other hand, we
claim that $\mathcal{M}$ satisfies the $\Pi^1_1$ separation principle, and therefore $\mathsf{ATR}_0$. Indeed, let
$\varphi(x)$ and $\psi(x)$ be $\Pi^1_1$ formulas with parameters from $\mathcal{M}$. Let $Z$ be the join of all set
parameters occurring in $\varphi$ and $\psi$. Then $Z \in \mathcal{S}^{\mathcal{M}}$, so we may fix a $t$ and $e$ such that
$Z = \Phi_e^{A_t}$. We may also fix $i$ and $j$ so that $\varphi(x) \equiv \theta_i(x, Z)$ and $\psi(x) \equiv \theta_j(x, Z)$. Let
$s = \langle e,t,i,j \rangle$. Then by construction, $A_{s+1} = A_s \oplus Z^*$ for some separating set $Z^*$ for
$\varphi$ and $\psi$. Hence, $\mathcal{M}$ satisfies the $\Pi^1_1$ separation scheme, which is equivalent to $\mathsf{ATR}_0$
by Theorem 5.8.20. □


5.9 Conservation results


The previous sections gave examples of reverse mathematics results obtained from 
direct reversals. Although these are often the most interesting results, they can 
sometimes be difficult to obtain. A second class of results shows that two principles 
have the same consequences at some level of the arithmetical or analytical hierarchy, 
without showing that the two principles are equivalent. 


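Definition 5.9.1 (Conservativity of theories). Let $\Gamma$ be a set of $\mathcal{L}_2$ sentences. An $\mathcal{L}_2$
theory $T_1$ is $\Gamma$ conservative over an $\mathcal{L}_2$ theory $T_2$ if $T_2 \vdash \varphi$ whenever $T_1 \vdash \varphi$, for all
$\varphi \in \Gamma$.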


A conservation result shows that one principle is conservative over another for
some class $\Gamma$ of formulas. Conservation results can be obtained using proof theoretic
methods, which involve directly analyzing the structure of formal proofs. They can
also be proved model theoretically, using the following definition.


Definition 5.9.2 (Conservativity of structures). Let $\Gamma$ be a set of $\mathcal{L}_2$ sentences,
and let $\mathcal{N} \subseteq \mathcal{M}$ be structures. We say that $\mathcal{M}$ is $\Gamma$ conservative over $\mathcal{N}$ if, for every
$\varphi \in \Gamma$, if $\mathcal{N} \models \varphi$ then $\mathcal{M} \models \varphi$.


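The following theorem gives two initial results on conservation in submodels.
Recall that a model $\mathcal{N}$ is an $\omega$-submodel of a structure $\mathcal{M}$ if $\mathcal{M}$ differs from $\mathcal{N}$ only
in having additional sets.


Theorem 5.9.3. Assume that $\mathcal{N}$ is an $\omega$-submodel of $\mathcal{M}$ and $\varphi$ is a sentence in
$\mathcal{L}_2(\mathcal{N})$.


1. If $\varphi$ is $\Pi^1_1$ and $\mathcal{M} \models \varphi$ then $\mathcal{N} \models \varphi$.
2. If $\varphi$ is $\Sigma^1_1$ and $\mathcal{N} \models \varphi$ then $\mathcal{M} \models \varphi$.


Proof. The key point is that for any arithmetical formula $\psi$ in $\mathcal{L}_2(\mathcal{N})$, $\mathcal{N} \models \psi$ if and
only if $\mathcal{M} \models \psi$ (Exercise 5.13.1). For part (1), assume $\varphi \equiv (\forall \bar{X})\,\psi(\bar{X})$ where $\psi$
is arithmetical. Let $\bar{C}$ be any sequence of sets from $\mathcal{S}^{\mathcal{N}}$ of the same length as $\bar{X}$.
Because these sets are in $\mathcal{M}$, we have $\mathcal{M} \models \psi(\bar{C})$, and because $\psi$ is arithmetical this
implies $\mathcal{N} \models \psi(\bar{C})$. Thus $\mathcal{N} \models (\forall \bar{X})\,\psi(\bar{X})$.

For part (2), assume $\varphi \equiv (\exists \bar{X})\,\psi(\bar{X})$ where $\psi$ is arithmetical. Because $\mathcal{N} \models \varphi$,
there is some $\bar{C}$ in $\mathcal{S}^{\mathcal{N}}$ with $\mathcal{N} \models \psi(\bar{C})$. Because $\mathcal{M}$ is an $\omega$-extension of $\mathcal{N}$, the $\bar{C}$
are in $\mathcal{S}^{\mathcal{M}}$ as well, and $\mathcal{M} \models \psi(\bar{C})$. Thus $\mathcal{M} \models (\exists \bar{X})\,\psi(\bar{X})$, as desired. □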


Not all conservation theorems are this straightforward. We will see below that
the system $\mathsf{WKL}_0$ is conservative over $\mathsf{RCA}_0$ for $\Pi^1_1$ sentences. It is also conservative
over primitive recursive arithmetic ($\mathsf{PRA}$) for $\Pi^0_2$ sentences. Both $\mathsf{PRA}$ and $\mathsf{RCA}_0$ are
strictly weaker than $\mathsf{WKL}_0$. These conservation results require more than the formal
manipulation of Theorem 5.9.3. They can be used to show that
certain uses of $\mathsf{WKL}_0$ are dispensable, and that certain implications are impossible.
For example, the conservation results just mentioned imply that if $\mathsf{WKL}_0$ is able to
prove the consistency of a theory (which is expressed as a $\Pi^0_1$ sentence) then this is
already provable in $\mathsf{PRA}$. Hence the consistency strength of $\mathsf{WKL}_0$ is quite low, even
though it is able to prove many sets exist that $\mathsf{RCA}_0$ cannot prove to exist. Conversely,
we know that no $\Pi^1_1$ sentence can be equivalent to $\mathsf{WKL}_0$ over $\mathsf{RCA}_0$, because if the
sentence is provable in $\mathsf{WKL}_0$ then it is already provable in $\mathsf{RCA}_0$.

A particularly striking application of this method is given by Kikuchi and
Tanaka [179], who show that Gödel's second incompleteness theorem is provable
in $\mathsf{PRA}$ by formalizing a model theoretic proof in the system $\mathsf{WKL}_0$, which is $\Pi^0_2$
conservative over $\mathsf{PRA}$.

The next theorem establishes a useful method for proving $\Pi^1_1$ conservativity.


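Theorem 5.9.4. Suppose that $T_1$ and $T_2$ are $\mathcal{L}_2$ theories such that every countable
$\mathcal{N} \models T_1$ is an $\omega$-submodel of some $\mathcal{M} \models T_2$. Then $T_2$ is $\Pi^1_1$ conservative over $T_1$.


Proof. We proceed by contraposition, and assume that $T_1$ does not prove a $\Pi^1_1$
sentence $\varphi$. By the completeness theorem, there is a countable model $\mathcal{N}$ of $T_1$ such that
$\mathcal{N} \models \neg\varphi$. By hypothesis, $\mathcal{N}$ is an $\omega$-submodel of some $\mathcal{M} \models T_2$. Now $\neg\varphi$ is equivalent
to a $\Sigma^1_1$ sentence $\rho$, so $\mathcal{N} \models \rho$ and thus, by the
previous theorem, $\mathcal{M} \models \rho$. This means that $\mathcal{M} \models \neg\varphi$, so $T_2$ does not prove $\varphi$ either. □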


The theorem has a useful corollary in the special but important case where
$T_1$ is $\mathsf{RCA}_0$ and $T_2$ is $\mathsf{RCA}_0$ together with a $\Pi^1_2$ sentence. This serves as a template
for many conservation results in reverse mathematics.


Definition 5.9.5 (Model extension). Fix an $\mathcal{L}_2$ structure $\mathcal{M}$ and a $G \subseteq M$ (which
need not be in $\mathcal{S}^{\mathcal{M}}$). Then $\mathcal{M}[G]$ is the model with first order part $M$ and second
order part consisting of all $A \subseteq M$ that are $\Delta^0_1$ definable with parameters from
$M \cup \mathcal{S}^{\mathcal{M}} \cup \{G\}$.


Notice that for every $G \subseteq M$, $\mathcal{M}$ is an $\omega$-submodel of $\mathcal{M}[G]$. Furthermore, $\mathcal{M}[G]$
is automatically closed under $\Delta^0_1$ comprehension.

By Exercise 5.13.1, $\mathcal{M}$ and $\mathcal{M}[G]$ satisfy the same arithmetical formulas with
parameters from $\mathcal{M}$. There are, however, many arithmetical formulas involving the
set $G$, and we cannot conclude anything about these on general principles. For
example, it may be that $\mathcal{M}$ satisfies $\mathsf{I}\Sigma^0_1$ but that $\mathcal{M}[G]$ does not. We would only
know that induction holds in $\mathcal{M}[G]$ for those $\Sigma^0_1$ formulas that do not include $G$ as a
parameter. In practice, this means that if we start with a model $\mathcal{M}$ of $\mathsf{RCA}_0$, and want
to add a $G$ so that $\mathcal{M}[G]$ is also a model of $\mathsf{RCA}_0$, then we need to separately verify
that $\mathsf{I}\Sigma^0_1$ holds in $\mathcal{M}[G]$.

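In cases where we can do this verification, we get the following conservation
result.


Corollary 5.9.6. Let $\mathsf{P}$ be an $\mathcal{L}_2$ sentence of the form

$$(\forall X)[\varphi(X) \to (\exists Y)\,\psi(X,Y)],$$

where $\varphi$ and $\psi$ are arithmetical. Suppose that for every countable model $\mathcal{M}$ of
$\mathsf{RCA}_0$ and every $A \in \mathcal{S}^{\mathcal{M}}$ such that $\mathcal{M} \models \varphi(A)$ there is a $G \subseteq M$ such that
$\mathcal{M}[G] \models \mathsf{RCA}_0 + \psi(A, G)$. Then $\mathsf{RCA}_0 + \mathsf{P}$ is $\Pi^1_1$ conservative over $\mathsf{RCA}_0$.

Proof. Fix a countable model $\mathcal{M} \models \mathsf{RCA}_0$. We define a chain of models $\mathcal{M}_0 \subseteq \mathcal{M}_1 \subseteq \cdots$
of $\mathsf{RCA}_0$, each an $\omega$-submodel of the next. Let $\mathcal{M}_0 = \mathcal{M}$. Now fix $s \in \omega$, and
assume by induction that we have defined a countable model $\mathcal{M}_t$ of $\mathsf{RCA}_0$ for each $t \leq s$,
along with an enumeration $\langle A_{t,e} : e \in \omega \rangle$ of all $A \in \mathcal{S}^{\mathcal{M}_t}$ such that $\mathcal{M}_t \models \varphi(A)$.
Say $s = \langle t,e \rangle$, so that $t,e \leq s$. Since $\mathcal{M}_t$ is an $\omega$-submodel of $\mathcal{M}_s$, it follows that
$\mathcal{M}_s \models \varphi(A_{t,e})$. Hence, by assumption there is a $G \subseteq M$ such that $\mathcal{M}_s[G] \models \mathsf{RCA}_0 + \psi(A_{t,e}, G)$.
Let $\mathcal{M}_{s+1} = \mathcal{M}_s[G]$. This completes the construction. Now, let $\mathcal{M}^* = \bigcup_s \mathcal{M}_s$.
(More precisely, $\mathcal{M}^*$ has first order part $M$, and $\mathcal{S}^{\mathcal{M}^*} = \bigcup_s \mathcal{S}^{\mathcal{M}_s}$.) Each $\mathcal{M}_s$, and
in particular $\mathcal{M}$, is an $\omega$-submodel of $\mathcal{M}^*$. From here, it is easy to see that $\mathcal{M}^*$ is
closed under $\Delta^0_1$ comprehension and satisfies $\mathsf{I}\Sigma^0_1$, so $\mathcal{M}^* \models \mathsf{RCA}_0$. Moreover, since
$\varphi$ is arithmetical, each $A \in \mathcal{S}^{\mathcal{M}^*}$ such that $\mathcal{M}^* \models \varphi(A)$ must be equal to $A_{t,e}$ for
some $t$ and $e$. Let $s = \langle t,e \rangle$. Then by construction, there is a $G \in \mathcal{S}^{\mathcal{M}_{s+1}}$ such
that $\mathcal{M}_{s+1} \models \psi(A, G)$. Since $\mathcal{S}^{\mathcal{M}_{s+1}} \subseteq \mathcal{S}^{\mathcal{M}^*}$ and $\psi$ is arithmetical, it follows that
$\mathcal{M}^* \models (\exists Y)\,\psi(A, Y)$. So $\mathcal{M}^* \models \mathsf{P}$. Now since $\mathcal{M}$ was arbitrary, the conclusion of the
corollary follows by Theorem 5.9.4. □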
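
The bookkeeping in the construction above relies on a pairing function for which both components of $\langle t,e \rangle$ are at most the code itself, so that the model indexed by $t$ is already available at stage $s = \langle t,e \rangle$. The Python sketch below checks this for the Cantor pairing function. This particular pairing is our assumption for illustration; any pairing with the same property would serve.

```python
def pair(t, e):
    """Cantor pairing of two natural numbers (an assumed choice of coding)."""
    return (t + e) * (t + e + 1) // 2 + e


def unpair(s):
    """Inverse of pair: recover (t, e) from the code s."""
    # Find the largest w with w * (w + 1) / 2 <= s; then w = t + e.
    w = 0
    while (w + 1) * (w + 2) // 2 <= s:
        w += 1
    e = s - w * (w + 1) // 2
    t = w - e
    return t, e


# Every stage s handles exactly one pair (t, e), and both components are <= s,
# so the requirement for (t, e) only refers to data built at or before stage s.
for s in range(10):
    t, e = unpair(s)
    print(s, (t, e), t <= s and e <= s)
```

Because every pair of natural numbers is the code of exactly one stage, each requirement is eventually handled, which is all that the dovetailing needs.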


5.10 First order parts of theories 


This section will establish several conservation results of a certain form. If $T_1$ and
$T_2$ are theories of second order arithmetic, $T_1$ has only arithmetical axioms, and
$T_2$ is conservative over $T_1$ for $\Pi^1_1$ formulas, then we call $T_1$ the first order part
of $T_2$. Similarly, the first order part of an $\mathcal{L}_2$ structure $\mathcal{M}$ is the $\mathcal{L}_1$ structure
obtained from $\mathcal{M}$ by discarding $\mathcal{S}^{\mathcal{M}}$.

Conservativity for $\Pi^1_1$ sentences is closely related to conservativity for arithmetical
formulas with free set variables, and the following results could also be stated in
terms of arithmetical formulas.


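Theorem 5.10.1. Suppose that $\mathcal{N}$ is an $\mathcal{L}_2$ structure that satisfies $\mathsf{PA}^- + \mathsf{I}\Sigma^0_1$. Then
there is an $\mathcal{L}_2$ structure $\mathcal{M}$ that satisfies $\mathsf{RCA}_0$ and $\omega$-extends $\mathcal{N}$.


Proof. Let $\mathcal{N}$ be as above. We let $\mathcal{M}$ have the same first order part as $\mathcal{N}$, and let
the second order part of $\mathcal{M}$ consist of all sets that are $\Delta^0_1$ definable in $\mathcal{N}$ (with no
set parameters). It is clear that $\mathcal{M}$ satisfies $\mathsf{PA}^-$. Furthermore, $\mathcal{M}$ satisfies the $\Delta^0_1$
comprehension scheme. Indeed, if $A$ is $\Delta^0_1$ definable from sets $B_0, \ldots, B_{n-1} \in \mathcal{S}^{\mathcal{M}}$
then each $B_i$ is $\Delta^0_1$ definable in $\mathcal{M}$ with no set parameters. Hence, by replacing
references to the $B_i$ in the definition of $A$ by $\Sigma^0_1$ or $\Pi^0_1$ formulas as needed, we obtain
a $\Delta^0_1$ definition of $A$ in $\mathcal{M}$ with no set parameters. Thus, $A \in \mathcal{S}^{\mathcal{M}}$.

We can similarly show that $\mathcal{M}$ satisfies the $\Sigma^0_1$ induction scheme. Indeed, suppose
$\varphi$ is $\Sigma^0_1$ and $\mathcal{M} \models \varphi(0) \wedge (\forall x)[\varphi(x) \to \varphi(x+1)]$. Each set parameter in $\varphi$ is an
element of $\mathcal{S}^{\mathcal{M}}$, so each reference to it in $\varphi$ may be replaced by a $\Sigma^0_1$ or $\Pi^0_1$ formula
with no set parameters to obtain a $\Sigma^0_1$ formula $\widehat{\varphi}$ with no set parameters such that
$\mathcal{M} \models (\forall x)[\varphi(x) \leftrightarrow \widehat{\varphi}(x)]$. In particular, $\mathcal{M}$ satisfies $\widehat{\varphi}(0) \wedge (\forall x)[\widehat{\varphi}(x) \to \widehat{\varphi}(x+1)]$.
By identifying $\mathcal{N}$ with the $\mathcal{L}_2$ structure with first order part $N$ and second order
part $\emptyset$, we can view $\mathcal{N}$ as an $\omega$-submodel of $\mathcal{M}$. Therefore, by Theorem 5.9.3,
$\mathcal{N} \models \widehat{\varphi}(0) \wedge (\forall x)[\widehat{\varphi}(x) \to \widehat{\varphi}(x+1)]$. Thus $\mathcal{N} \models (\forall x)\,\widehat{\varphi}(x)$ by $\Sigma^0_1$ induction in $\mathcal{N}$. By
Theorem 5.9.3 again, $\mathcal{M}$ satisfies $(\forall x)\,\widehat{\varphi}(x)$ and hence $(\forall x)\,\varphi(x)$. □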


We will obtain several corollaries from this result. The first is a direct application
of Theorem 5.10.1. The other two provide a link between $\mathsf{RCA}_0$ and first order
arithmetic.


Corollary 5.10.2. $\mathsf{RCA}_0$ is conservative over $\mathsf{I}\Sigma^0_1$ for $\Pi^1_1$ sentences.


Corollary 5.10.3. An $\mathcal{L}_1$ structure satisfies the $\Sigma^0_1$ induction scheme if and only if it
is the first order part of some model of $\mathsf{RCA}_0$.


Proof. Given an $\mathcal{L}_1$ structure $\mathcal{N}$ satisfying the $\Sigma^0_1$ induction scheme, consider the
expansion of $\mathcal{N}$ to an $\mathcal{L}_2$ structure $\mathcal{N}'$ with no sets. Then $\mathcal{N}'$ satisfies $\mathsf{I}\Sigma^0_1$ and has
the same $\mathcal{L}_1$ theory as $\mathcal{N}$. Now, by the theorem, $\mathcal{N}'$ expands to a model $\mathcal{M}$ of $\mathsf{RCA}_0$,
which has the same arithmetical theory. Thus, by the same oO 


Corollary 5.10.4. $\mathsf{RCA}_0$ is conservative over $\mathsf{PA}^- + \mathsf{I}\Sigma^0_1$ (restricted to $\mathcal{L}_1$ sentences).


We next obtain a parallel theorem showing that each model of arithmetical
induction extends to a model of $\mathsf{ACA}_0$.


Theorem 5.10.5. Suppose $\mathcal{N}$ is an $\mathcal{L}_2$ structure satisfying the arithmetical induction
scheme. Then there is an $\mathcal{L}_2$ structure $\mathcal{M}$ that satisfies $\mathsf{ACA}_0$ and $\omega$-extends $\mathcal{N}$.


Corollary 5.10.6. An $\mathcal{L}_1$ structure satisfies the scheme for arithmetical induction if
and only if it is the first order part of some model of $\mathsf{ACA}_0$.


Corollary 5.10.7. $\mathsf{ACA}_0$ is conservative over Peano arithmetic for sentences in $\mathcal{L}_1$.


It should be noted that, although the results in this section might suggest a larger
family of conservation results, the cases of $\mathsf{RCA}_0$ and $\mathsf{ACA}_0$ are somewhat special.
For example, it is not true that every model of the full second order induction scheme
extends to a model of $\mathsf{Z}_2$ satisfying full comprehension.


5.11 Comparing reducibility notions 


At this point, we are ready to compare the reducibility notions of Chapter 4 with the 
reducibility notions based on provability in second order arithmetic. 

In general, if we have a set $S$ whose elements are viewed as problems in some
sense, a reducibility notion (or just reducibility) on $S$ is simply a reflexive, transitive
relation on $S$. Each of these requirements has a natural motivation. Reflexivity
matches the informal concept that a problem should be reducible to itself. Transitivity
matches the informal concept that, if we can solve problem $\mathsf{P}$ by appealing to problem
$\mathsf{Q}$, and we can solve problem $\mathsf{Q}$ by appealing in the same manner to problem $\mathsf{R}$, then
we should be able to solve $\mathsf{P}$ by appealing to $\mathsf{R}$ directly.

Figure 5.1. Relationships between $\leq_{\mathrm{c}}$, $\leq_{\omega}$, and $\leq_{\mathsf{RCA}_0}$. An arrow from one reducibility to another
indicates that if $\mathsf{P}$ and $\mathsf{Q}$ are $\Pi^1_2$ sentences of $\mathcal{L}_2$, regarded alternatively as problems, then if $\mathsf{P}$ is
reducible to $\mathsf{Q}$ in the sense of the first, it is also reducible in the sense of the second. No additional
arrows can be added.

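We may regard provability over $\mathsf{RCA}_0$ as a reducibility in the above sense: we can
write $\mathsf{P} \leq_{\mathsf{RCA}_0} \mathsf{Q}$ if $\mathsf{RCA}_0 \vdash \mathsf{Q} \to \mathsf{P}$, that is, every model of $\mathsf{RCA}_0 + \mathsf{Q}$ satisfies $\mathsf{P}$. Here,
$\mathsf{P}$ and $\mathsf{Q}$ range over sentences of $\mathcal{L}_2$. As discussed in Section 3.2, when these are
formalizations of $\forall\exists$ theorems we may think of them naturally as instance–solution
problems. Notably, this applies to $\Pi^1_2$ sentences of $\mathcal{L}_2$, of which we will see many
examples throughout this book.

In this case, we may compare $\leq_{\mathsf{RCA}_0}$ directly with our prior reducibilities, of which
the strongest to consider are $\leq_{\mathrm{c}}$ and $\leq_{\omega}$. It is easy to see that if $\mathsf{P} \leq_{\mathrm{c}} \mathsf{Q}$ or $\mathsf{P} \leq_{\mathsf{RCA}_0} \mathsf{Q}$
then $\mathsf{P} \leq_{\omega} \mathsf{Q}$, but that no additional relationships between the three reducibilities
hold. (We have already seen examples of problems $\mathsf{P}$ and $\mathsf{Q}$ such that $\mathsf{P} \leq_{\omega} \mathsf{Q}$ but
$\mathsf{P} \nleq_{\mathrm{c}} \mathsf{Q}$. We will see examples of problems with $\mathsf{P} \leq_{\omega} \mathsf{Q}$ but $\mathsf{P} \nleq_{\mathsf{RCA}_0} \mathsf{Q}$ in the next
chapter.) See Figure 5.1, and compare with Figure 4.6.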

Reverse mathematics was first associated with reducibilities such as $\leq_{\mathsf{RCA}_0}$.
Reducibilities such as $\leq_{\omega}$ were also of interest, especially for nonreducibility results:
a result that $\mathsf{P} \leq_{\mathsf{RCA}_0} \mathsf{Q}$ gives more information than $\mathsf{P} \leq_{\omega} \mathsf{Q}$, while a result that
$\mathsf{P} \nleq_{\omega} \mathsf{Q}$ gives more information than a result that $\mathsf{P} \nleq_{\mathsf{RCA}_0} \mathsf{Q}$. Interest in $\leq_{\mathrm{c}}$ and
related reducibilities ($\leq_{\mathrm{W}}$, etc.) came later. When we prove a result of the form $\mathsf{P} \leq_{\mathrm{c}} \mathsf{Q}$,
we are establishing the direct computability of solutions of $\mathsf{P}$ from solutions to $\mathsf{Q}$. If
we instead establish $\mathsf{P} \leq_{\omega} \mathsf{Q}$, we are showing a more indirect form of computability
of solutions, because $\mathsf{Q}$ may be applied many times to solve a single instance of $\mathsf{P}$.
This form of reducibility requires the game theoretic approach of Theorem 4.7.5.

When we establish a result of the form $\mathsf{P} \leq_{\mathsf{RCA}_0} \mathsf{Q}$, we are certainly establishing an
indirect form of computability, but perhaps more importantly we are establishing the
verifiability of solutions to $\mathsf{P}$. This dichotomy between computability and verification
is parallel to the dichotomy between computational complexity and verification in the
analysis of algorithms. (For more on this dichotomy in proofs over $\mathsf{RCA}_0$, including
some means of measuring “how much” of the use of a principle in a proof is for
computability and “how much” for verification, see the recent work in [82, 151, 256].)

The lack of relationships between $\leq_{\mathsf{RCA}_0}$ and $\leq_{\mathrm{c}}$ is a motivation for considering
both of these reducibilities in practice. We can use precisely defined reducibility
notions to describe the aspects of computability, verification, and uniformity that we
see when studying theorems. In some cases, the same proof method establishes both
$\mathsf{P} \leq_{\mathrm{c}} \mathsf{Q}$ (or even $\mathsf{P} \leq_{\mathrm{W}} \mathsf{Q}$, say) and $\mathsf{P} \leq_{\mathsf{RCA}_0} \mathsf{Q}$. This happens surprisingly often:
many theorems have proofs that are both effective and uniform.

In other cases, examining the proof that $\mathsf{P} \leq_{\mathsf{RCA}_0} \mathsf{Q}$ reveals a nonuniformity. Often,
nonuniformity appears through use of the law of the excluded middle in a situation
where we cannot effectively decide which of a number of possible cases holds. We
have seen examples of this in our dealings with $\leq_{\mathrm{c}}$ and $\leq_{\mathrm{W}}$. Here is another, using
Ramsey's theorem for singletons (Definition 3.2.5). Consider the following proof in
$\mathsf{RCA}_0$ that $\mathsf{RT}^1_2$ implies $\mathsf{RT}^1_3$. Given a coloring $f \colon \omega \to 3$, if $f(x) = 0$ for infinitely
many $x$ then we are done. Otherwise, there is a number $m$ such that $f(x) > 0$
for all $x \geq m$. Then $f^*(x) = f(x + m) - 1$ induces a coloring $f^* \colon \omega \to 2$, and
we can compute an infinite homogeneous set for $f$ from any such set for $f^*$. The
nonuniformity here is in the question whether the first color is used infinitely often.
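
The second case of this argument can be sketched in Python as follows. This is only an illustration on finite samples: the bound $m$ is supplied by hand, which is exactly the nonuniform piece of information, and the function names `shift_coloring` and `pull_back` as well as the sample coloring are hypothetical.

```python
def shift_coloring(f, m):
    """Return f*(x) = f(x + m) - 1, assuming f(x) > 0 for all x >= m."""
    def f_star(x):
        value = f(x + m) - 1
        # The assumption on m is what makes f* a 2-coloring.
        assert value in (0, 1), "m is not a valid bound for color 0"
        return value
    return f_star


def pull_back(h_star, m):
    """Translate (a finite part of) a homogeneous set for f* into one for f."""
    return [x + m for x in h_star]


def f(x):
    # Sample 3-coloring: color 0 is used only below 5, then colors 1 and 2 alternate.
    return 0 if x < 5 else 1 + (x % 2)


m = 5  # supplied by hand: no single algorithm finds m from f in general
f_star = shift_coloring(f, m)
h_star = [1, 3, 5, 7]        # finite part of a homogeneous set for f*
h = pull_back(h_star, m)     # the corresponding elements for f
print([f_star(x) for x in h_star], [f(x) for x in h])
```

The translation from `h_star` to `h` is uniform once $m$ is known; what cannot be done uniformly is deciding which case applies and, in the second case, producing $m$.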

Of course, there are many alternate proofs of a theorem, especially if we consider
all the possible formal proofs. We might ask whether there is an alternate proof that
avoids this kind of nonuniformity. If we can show that $\mathsf{P} \nleq_{\mathrm{W}} \mathsf{Q}$, this shows in a
precise way that nonuniformity cannot be avoided, in the sense that there is no proof
uniform enough to give a uniform algorithm for reducing $\mathsf{P}$ to $\mathsf{Q}$.

Yet another possibility is that the proof that $\mathsf{P} \leq_{\mathrm{c}} \mathsf{Q}$ uses strong logical axioms, 
such as to verify its correctness. For example, the proof method of $\mathsf{RT}^1_2 \to \mathsf{RT}^1_3$ 
sketched above yields an inductive proof that $\mathsf{RT}^1_2$ implies $\mathsf{RT}^1$ as follows. Given 
$f \colon \omega \to k + 1$, if there is no infinite homogeneous set for $f$ with 
color 0, then by induction there is an infinite homogeneous set for $f$ with color 
$i > 0$, using the same technique as above with an alternate coloring $f^*$. In ordinary 
mathematics, this induction is hardly noticed. However, when formalized into second 
order arithmetic, this is induction on the existence of a coloring: a xt induction. We 
may ask if there is an alternate proof that stays within the resources of $\mathsf{RCA}_0$. The 
theorem that $\mathsf{RT}^1 \nleq_{\mathsf{RCA}_0} \mathsf{RT}^1_2$, in Chapter 6, shows that the answer is no. In fact, 
Theorem 6.5.1 characterizes precisely how much additional induction is required, 
beyond $\mathsf{RCA}_0$, to prove $\mathsf{RT}^1$. 


5.12 Full second order semantics 


By convention, second order arithmetic is formalized with first order semantics. 
This means simply that in any $\mathcal{L}_2$ structure $\mathcal{M}$ the collection of sets $\mathcal{S}^{\mathcal{M}}$ may be 
an arbitrary subset of the powerset $\mathcal{P}(M)$. There is an alternative semantics which 
can be employed, full semantics, in which $\mathcal{S}^{\mathcal{M}}$ is required to contain all subsets of 
$M$. Intuitively, this change will result in fewer possible structures, and thus fewer 
consistent theories. We will call an $\mathcal{L}_2$ structure $\mathcal{M}$ full if $\mathcal{S}^{\mathcal{M}} = \mathcal{P}(M)$. 

If we only consider full models, rather than arbitrary models, the resulting 
semantics is known as full second order semantics. The next theorem will show that 
these semantics are more powerful than the usual semantics defined in the previous 
section (which are known as Henkin semantics in the literature). However, as we 
will also see, the cost of using full second order semantics is that Theorem 5.1.5 no 
longer holds. 

The key example of full second order semantics is in Peano’s original axiomatization 
of arithmetic. The following definition captures the essence of that axiomatization, 
but is not identical to Peano’s original axiomatization. 


Definition 5.12.1. The second order Peano axioms, $\mathsf{PA}^2$, are a set of axioms for 
arithmetic in a two sorted logic with one sort for numbers and a second sort for sets. 
The signature for $\mathsf{PA}^2$ has a constant symbol 0 of type 0 and a unary function symbol 
$S$ of type $0 \to 0$. There are only three axioms. The first two state that 0 is not in the 
range of $S$, and that $S$ is an injection. 


1. $S(n) \neq 0$. 
2. $S(n) = S(m) \to n = m$. 


The third axiom is known as the second order induction axiom. It states that any set 
which contains 0 and is closed under $S$ must contain every element of the domain. 


3. $(\forall X)[(0 \in X \wedge (\forall n)[n \in X \to S(n) \in X]) \to (\forall n)[n \in X]]$. 


We define structures for the signature of $\mathsf{PA}^2$ as we did for the signature $\mathcal{L}_2$. The 
only difference is that $+$ and $\cdot$ have been replaced by $S$. 


Theorem 5.12.2. If $\mathcal{M}$ is a full structure satisfying the Peano axioms, then $\mathcal{M}$ is 
isomorphic to $\omega$ with its standard successor operation. Hence $\mathcal{M}$ is isomorphic to 
an $\omega$-model. 


Proof. Suppose that $\mathcal{M} = (M, \mathcal{P}(M), S)$ is a full model of the Peano axioms. 
Consider the set $D = \{0, S(0), S(S(0)), \ldots\} = \{S^k(0) : k \in \omega\}$, which is a subset of 
the domain of $\mathcal{M}$ and thus an element of $\mathcal{S}^{\mathcal{M}} = \mathcal{P}(M)$. Because this set contains 
0 and is closed under $S$, and because $\mathcal{M}$ satisfies the induction axiom, $D = M$. Thus 
the map $f \colon k \mapsto S^k(0)$ is a surjection from $\omega$ to $M$. 

We next prove that $f$ is injective. Suppose not. Then because $\omega$ is well founded 
there must be a minimal $l \in \omega$ such that there is some $k < l$ with $S^k(0) = S^l(0)$. 
It cannot be that $k = 0$, because of axiom 1. Thus we may write $k = p + 1$ and 
$l = q + 1$, so that $S(S^p(0)) = S(S^q(0))$. Then, by axiom 2, $S^p(0) = S^q(0)$. Because 
$k < l$, we have $p < q$, and thus we have a smaller counterexample to injectivity, 
which is impossible. Thus $f$ is a bijection from $\omega$ to $M$. Finally, by definition, we 
have $S(f(k)) = f(k + 1)$, and thus $f$ is an isomorphism from $\omega$ with the standard 
successor operation to $\mathcal{M}$. □ 


Hence, in particular, if $\mathcal{M}$ is a full structure, then its first order part is standard. 
Hence, “full semantics” is sometimes also known as “standard semantics”. 

The key point is that any set that we can form during the proof is “already covered” 
by the quantifier in the third axiom, under full semantics. This mixture of the object 
theory (the Peano axioms) and the theory in which the proof is written is a unique 
feature of full semantics. 

It is natural to ask what theory can be used to formalize the proof. One answer is 
that the proof can be formalized in ZFC if we also formalize second order semantics 
in ZFC. That is, if we write 


“M satisfies the Peano axioms under full semantics” (5.7) 


as a sentence in the language of ZFC, then ZFC proves 


“If M satisfies the Peano axioms under full semantics, 
then M is isomorphic to $\omega$ with the usual successor operation” (5.8) 


as a theorem. Of course, there are nonstandard models of ZFC, each of which has 
its own $\omega$. Thus, when we read (5.8) translated into ZFC, it simply says that within 
a fixed model of ZFC, a set which appears to be a model of the second order Peano 
axioms is isomorphic to the $\omega$ of that model. 

It is also possible to interpret (5.7) informally, in the usual style of mathematics. In 
this case, the universal set quantifier in the third axiom would range over all subsets 
of $M$ that exist, and the proof shows that $\mathcal{M}$ is isomorphic to the standard $\omega$. The 
reader may decide whether to interpret full second order semantics in the formal 
sense of the previous paragraph or in the informal sense of this one. 

Theorem 5.12.2 also shows that the downward Löwenheim–Skolem theorem fails 
for second order theories with full semantics: although $\mathsf{PA}^2$ is a countable theory, it 
has no model $\mathcal{M}$ in which both $M$ and $\mathcal{S}^{\mathcal{M}}$ are countable. The exercises show that 
the completeness and compactness theorems also fail. 


5.13 Exercises 


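Exercise 5.13.1. Let $\mathcal{N}$ and $\mathcal{M}$ be $\mathcal{L}_2$ structures with $\mathcal{N}$ an $\omega$-submodel of $\mathcal{M}$. 
Prove by induction on complexity that if $\varphi$ is an arithmetical formula, possibly with 
parameters from $N = M$ and $\mathcal{S}^{\mathcal{N}}$, then $\mathcal{N} \models \varphi$ if and only if $\mathcal{M} \models \varphi$. 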


Exercise 5.13.2. Show the following can be defined by bounded quantifier formulas. 


1. The set of numbers that are codes for finite sets. 

2. The set of numbers that are codes for finite sequences. 

3. The set of (y,m) such that y is the code for a finite set X and m = max X. 

4. The set of $(y, l)$ such that $y$ is a code for a finite sequence and $l$ is the length of 
the sequence. 

5. The set of (y, i, k) such that y is a code for a finite sequence, i < lh(y), and k is 
the ith element of the sequence. 


Exercise 5.13.3. Prove that the following properties of a set $T \subseteq \mathbb{N}^{<\mathbb{N}}$ can be 
expressed as $\Pi^0_1$ formulas: 


1. T is a tree. 
2. T is a binary tree. 

The following properties of a set $T$ can be expressed as $\Pi^0_2$ formulas: 


1. T is an infinite tree. 
2. T is a finitely branching tree. 


Exercise 5.13.4. Show the following are provable in $\mathsf{RCA}_0$: 


1. The graph of each primitive recursive function is coded by a set. 
2. If $f$ is a function then, for each $k$ and each $n$ there is a finite sequence 


$\langle n, f(n), f(f(n)), \ldots, f^k(n) \rangle$. 


3. If $\sigma$ and $\tau$ are finite sequences then so is the concatenation $\sigma^\frown\tau$. 


Exercise 5.13.5. Show that if $\varphi(x)$ is a $\Sigma^0_1$ formula of $\mathcal{L}_2$ then $\mathsf{RCA}_0$ proves the 
following: if there are infinitely many $x$ such that $\varphi(x)$ then there exists an infinite 
set $X$ such that $\varphi(x)$ for all $x \in X$. 


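Exercise 5.13.6. Show that for every $n \geq 1$, every model $\mathcal{M}$ of $\mathsf{RCA}_0$, and every 
function $f \colon M \to M$ (which may or may not belong to $\mathcal{S}^{\mathcal{M}}$), the following are 
equivalent. 


1. $f$ is $\Sigma^0_n$ definable in $\mathcal{M}$. 
2. $f$ is $\Pi^0_n$ definable in $\mathcal{M}$. 
3. $f$ is $\Delta^0_n$ definable in $\mathcal{M}$. 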


Exercise 5.13.7. In principle, we could consider an alternate definition of an L2 
structure as a tuple 


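$\mathcal{M} = (M, \mathcal{S}^{\mathcal{M}}, +^{\mathcal{M}}, \cdot^{\mathcal{M}}, 0^{\mathcal{M}}, 1^{\mathcal{M}}, <^{\mathcal{M}}, \in^{\mathcal{M}})$, 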


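where $\mathcal{S}^{\mathcal{M}}$ is now an arbitrary set of objects and $\in^{\mathcal{M}}$ is a fixed subset of $M \times \mathcal{S}^{\mathcal{M}}$. 
Suppose that $\mathcal{M}$ is a structure in this alternate sense that satisfies the extensionality 
axiom 

$(\forall X, Y)[X = Y \leftrightarrow (\forall n)[n \in X \leftrightarrow n \in Y]]$. 


Prove that $\mathcal{M}$ is isomorphic to a structure as in Definition 5.1.2. Thus there is no 
loss of generality in assuming that $\mathcal{S}^{\mathcal{M}} \subseteq \mathcal{P}(M)$ and that $\in^{\mathcal{M}}$ is the set membership 
relation. 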


Exercise 5.13.8. Suppose that $\varphi(X)$ is an $\mathcal{L}_2$ formula with one free set variable, 
which may have parameters. Show by induction on the structure of $\varphi$ that 


$(\forall n)[n \in X \leftrightarrow n \in Y] \to [\varphi(X) \leftrightarrow \varphi(Y)]$. 


Exercise 5.13.9. Show that there is an effective second order theory $T$ such that every 
finite subtheory of $T$ has a full model but $T$ itself does not. This shows that the 
compactness theorem fails for second order logic with full semantics. 


Exercise 5.13.10. Show that there is an effective second order theory $T$ that is 
consistent (that is, there is no formula $\varphi$ such that both $\varphi$ and $\neg\varphi$ are derivable from 
$T$) but $T$ has no full model. This shows that the completeness theorem fails for second 
order logic with full semantics. 


Exercise 5.13.11. Prove that there is a definitional extension $T$ of the Peano axioms 
so that any full model of $T$ is isomorphic to the standard model of second order 
arithmetic $(\omega, \mathcal{P}(\omega), S, +, \times, <, =)$. 


Exercise 5.13.12. The results stated in Theorem 5.1.5 are typically stated and proved 
only for single sorted first order logic. We sketch how to reduce the two sorted logic 
to the single sorted one. First, extend the signature with an additional unary predicate 
$N$. Given an $\mathcal{L}_2$ formula $\varphi$, translate it recursively to a single sorted formula $\varphi'$ with 
the rules 


$((\forall n)\psi(n))' = (\forall y)(N(y) \to \psi'(y))$, 

$((\exists n)\psi(n))' = (\exists y)(N(y) \wedge \psi'(y))$, 

$((\forall X)\psi(X))' = (\forall y)(\neg N(y) \to \psi'(y))$, 

$((\exists X)\psi(X))' = (\exists y)(\neg N(y) \wedge \psi'(y))$, 

$\psi' = \psi$ otherwise. 


Given an $\mathcal{L}_2$ theory $T$, let $T'$ be the single sorted theory $\{\varphi' : \varphi \in T\}$. Show that 
$T$ is consistent if and only if $T'$ is consistent in first order logic. Use this method to 
produce a proof of Theorem 5.1.5 using the corresponding results for single sorted 
logic as lemmas. 


Exercise 5.13.13. Prove that there is a $\Sigma^0_1$ formula $\pi(e, n, X)$ such that for each $\Sigma^0_1$ 
formula $\psi(n, X)$ there is an $e^* \in \omega$ such that $\mathsf{RCA}_0$ proves 


$(\forall n)[\pi(e^*, n, X) \leftrightarrow \psi(n, X)]$. 


This formula $\pi$ is one kind of universal $\Sigma^0_1$ formula. 
More generally, for each $n > 0$ there is a $\Sigma^0_n$ formula $\varphi(e, n, X)$ such that for each 
$\Sigma^0_n$ formula $\psi(n, X)$ there is an $e^* \in \omega$ such that $\mathsf{RCA}_0$ proves 


$(\forall n)[\varphi(e^*, n, X) \leftrightarrow \psi(n, X)]$. 


Exercise 5.13.14. Show that the systems $\mathsf{RCA}_0$, $\mathsf{WKL}_0$, and $\mathsf{ACA}_0$ are all finitely 
axiomatizable. (Use the universal formula from Exercise 5.13.13.) 


Exercise 5.13.15. Let g(X) be any pi formula of £5 with no free variables other than 
X. Prove that there exist formulas yy (x) and yy (x) with the same parameters as y and 
no free variables other than x such that RCAg + (VX)[y(X) © (SJ) [gs(X TJ] 


(VA) [on(X TyZ)I. 


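Exercise 5.13.16. Let $\varphi(X)$ be a $\Sigma^0_1$ formula of $\mathcal{L}_2$. Prove by induction on the 
complexity of $\varphi$ that there is a $\Sigma^0_0$ formula $\theta(x)$ such that $\mathsf{RCA}_0 \vdash \varphi(X) \leftrightarrow (\exists k)\theta(X \upharpoonright k)$. 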


Exercise 5.13.17. Prove Lemma 5.2.2 by structural induction. Then prove Theo- 
rem 5.2.3. The key case is for xf formulas. Given a x formula, it can be shown that 
the set being defined is enumerable. Conversely, given a = set, use Kleene’s T and 
U predicates and Lemma 5.2.2. 


Exercise 5.13.18. Show the following statements are equivalent to $\mathsf{WKL}_0$ over $\mathsf{RCA}_0$. 


1. The $\Sigma^0_1$ separation principle, i.e., for all pairs $\varphi$ and $\psi$ of $\Sigma^0_1$ formulas of $\mathcal{L}_2$, if 
$(\forall x)[\varphi(x) \to \neg\psi(x)]$ then there is a set $Z$ such that 


$(\forall x)[(\varphi(x) \to x \in Z) \wedge (\psi(x) \to x \notin Z)]$. 


2. If $f \colon \mathbb{N} \to \mathbb{N}$ and $g \colon \mathbb{N} \to \mathbb{N}$ are functions with disjoint ranges, then there is a 
set $Z$ such that 


$(\forall y)[((\exists x)[f(x) = y] \to y \in Z) \wedge ((\exists x)[g(x) = y] \to y \notin Z)]$. 


Exercise 5.13.19 (Arithmetical transfinite induction in $\mathsf{ACA}_0$). Prove the following 
transfinite induction scheme in $\mathsf{ACA}_0$. Suppose $\mathrm{WO}(X)$, and let $\psi$ be an 
arithmetical formula so that $\psi(0_X)$ holds and, for all $x \in X$, if $\psi(y)$ holds for all 
$y <_X x$, then $\psi(x)$ holds. Then $\psi(x)$ holds for all $x$. 


Exercise 5.13.20. The following exercise shows that although $\mathsf{ACA}_0$ is weaker than 
$\mathsf{ATR}_0$, it can prove the following restricted form of arithmetical transfinite recursion. 
Arguing in $\mathsf{ACA}_0$, suppose $\mathrm{WO}(X)$, and that $f \colon \mathbb{N} \to \mathbb{N}$ is a function such that for 
every x € X and every e, if OX = (S, : y <y x) then Oe) = (Sy : y <x x). Then 
(S, :x € X) exists. 


Exercise 5.13.21. 


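1. Prove in $\mathsf{RCA}_0$ that $<_{\mathrm{KB}}$ is a dense linear order on $\mathbb{N}^{<\mathbb{N}}$. 
2. Prove in $\mathsf{ACA}_0$ that for all trees $T \subseteq \mathbb{N}^{<\mathbb{N}}$, $\mathrm{WF}(T) \leftrightarrow \mathrm{WO}(\mathrm{KB}(T))$. 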


Exercise 5.13.22. Prove Theorem 5.8.9. There are two stages for the reversal. First, 
create a reversal to show that the principle implies $\mathsf{ACA}_0$. Then, working over $\mathsf{ACA}_0$, 
use Kleene’s normal form theorem to convert an instance of ot comprehension into 
its associated sequence of trees. 


Exercise 5.13.23. If $T \subseteq \omega^{<\omega}$ is a tree, a path $f \in [T]$ is the leftmost path if, for 
all $g \in [T]$ and all $n$, $f(n) \leq g(n)$. Prove the following. 


1. Over $\mathsf{RCA}_0$, $\mathsf{ACA}_0$ is equivalent to the principle that every infinite tree on $\{0, 1\}$ 
has a leftmost path. 

2. Over $\mathsf{RCA}_0$, $\Pi^1_1$-$\mathsf{CA}_0$ is equivalent to the principle that every infinite tree on $\omega$ 
has an infinite path. 


Exercise 5.13.24. Prove in $\mathsf{RCA}_0$ that for every infinite set $S$, the principal function 
of $S$ exists: i.e., there exists a function $p_S \colon \mathbb{N} \to \mathbb{N}$ with $\operatorname{range}(p_S) = S$ and 
$p_S(n) < p_S(m)$ for all $n < m$. 


Exercise 5.13.25. Prove in $\mathsf{RCA}_0$ that there is a bijection $\mathbb{N} \times \mathbb{N} \to \mathbb{N}$. (Hint: Start 
with the function $f(n, m) = \langle n, m \rangle$ defined in Theorem 5.3.4. Let $S$ be the range of $f$ 
and prove that it exists. Then, consider $p_S^{-1} \circ f$.) 


[Figure 5.2: diagram of subsystems of $\mathsf{Z}_2$. Labels recoverable from the original: $\mathsf{ATR}_0$ 
(Definition 5.8.17), Definition 5.8.14, $\mathsf{ACA}_0^+$ (Definition 5.6.8), $\mathsf{ACA}_0'$ (Definition 5.6.5), 
$\mathsf{ACA}_0$ (Definition 5.6.1), $\mathsf{WKL}_0$ (Definition 5.6.13), $\mathsf{WWKL}_0$ (Definition 9.11.1), 
$\mathsf{RCA}_0$ (Definition 5.5.2); separation references: Theorem 5.8.22, Proposition 5.6.9, 
Proposition 5.6.14, Corollary 9.11.6, Corollary 9.11.3.] 

Figure 5.2. Various subsystems of $\mathsf{Z}_2$, in order of strength. No double arrow can be reversed. All 
separations are witnessed by $\omega$-models except between $\mathsf{ACA}_0$ and $\mathsf{ACA}_0'$, whose $\omega$-models coincide; 
the references to separations are indicated. The so-called “big five” are circled. The system $\mathsf{WWKL}_0$ 
will be defined in Chapter 9. 


Chapter 6 


Induction and bounding 


The most common kind of reverse mathematics result shows that a given problem 
requires a particular set existence axiom to solve. Accordingly, we talk of the second 
order strength of the principle. A number of combinatorial principles also require 
additional induction axioms, beyond $\Sigma^0_1$ induction, to prove in second order arithmetic. 
A landmark example is $\mathsf{RT}^1$, which is equivalent to the bounding principle $\mathsf{B}\Sigma^0_2$ over 
$\mathsf{RCA}_0$. Being computably true, $\mathsf{RT}^1$ has no second order strength beyond $\mathsf{RCA}_0$. But 
it does consequently have additional first order strength. In this chapter, we describe 
the Kirby–Paris hierarchy of induction and bounding principles, which can be used 
to measure the first order parts of principles. We discuss reverse recursion theory, 
and prove the result mentioned above, due to Hirst, about $\mathsf{RT}^1$. 


6.1 Induction, bounding, and least number principles 


To begin our discussion of induction strength, we first define several important 
first order principles and establish the basic relationships between them. We open by 
recalling the definition of induction from Definition 5.3.5. Recall that for a collection 
$\Gamma$ of formulas of $\mathcal{L}_2$, the $\Gamma$ induction scheme ($\mathsf{I}\Gamma$) is the scheme over all $\varphi(x) \in \Gamma$ of 
sentences of the form 


$(\varphi(0) \wedge (\forall x)[\varphi(x) \to \varphi(x + 1)]) \to (\forall x)\varphi(x)$. 


Our interest here will be almost exclusively in cases where $\Gamma$ is a class of arithmetical 
formulas. 

The notation $\mathsf{I}\Gamma$ originates in the study of models of first order arithmetic, where 
the class $\Gamma$ above would be a class of formulas of $\mathcal{L}_1$ rather than $\mathcal{L}_2$. For this reason, 
some authors prefer to reserve the notation $\mathsf{I}\Gamma$ for $\mathcal{L}_1$ formulas exclusively, and use 
instead the separate notation $\Gamma$-IND when dealing with $\mathcal{L}_2$ formulas. We will not do so, 
largely because the potential for confusion is minimal. In proofs over a theory (like 
PA) that can be viewed as a theory in either language, the presence or absence of set 
variables tends to make no difference. Since our interest is second order arithmetic, 
we can think of everything as happening in $\mathcal{L}_2$ for simplicity. But most results below 
were originally formulated in $\mathcal{L}_1$, and they hold equally well in either case. 

The next principle, $\mathsf{B}\Gamma$, may not immediately look like an induction statement. 


Definition 6.1.1. Let $\Gamma$ be a collection of formulas of $\mathcal{L}_2$. The bounding scheme (or 
$\Gamma$ collection scheme) ($\mathsf{B}\Gamma$) is the scheme consisting, over all $\varphi(x, y) \in \Gamma$, of sentences 
of the form 


$(\forall z)[(\forall x < z)(\exists y)\varphi(x, y) \to (\exists w)(\forall x < z)(\exists y < w)\varphi(x, y)]$. 


One way of thinking about bounding is that, in the expression $(\forall x < z)(\exists y)\varphi(x, y)$, it 
allows us to shift the existential quantifier outside the scope of the bounded universal. 
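
As a concrete illustration, the following Python sketch computes a bound $w$ for one instance of the scheme in the standard natural numbers. It is a sketch under stated assumptions: the relation `phi`, the helper names, and the `search_limit` cutoff are hypothetical and chosen only to make the example terminate. In the standard model such a bound always exists, as a maximum over finitely many least witnesses; the scheme has content because it is required to hold in every model, including nonstandard ones.

```python
from typing import Callable, Optional


def least_witness(phi: Callable[[int, int], bool], x: int,
                  search_limit: int) -> Optional[int]:
    """Return the least y < search_limit with phi(x, y), or None if there is none."""
    for y in range(search_limit):
        if phi(x, y):
            return y
    return None


def collection_bound(phi: Callable[[int, int], bool], z: int,
                     search_limit: int = 10_000) -> Optional[int]:
    """Return w with (forall x < z)(exists y < w) phi(x, y).

    Returns None if some x < z has no witness below search_limit.
    """
    w = 0
    for x in range(z):
        y = least_witness(phi, x, search_limit)
        if y is None:
            return None
        w = max(w, y + 1)  # y < w must hold, so w is at least y + 1
    return w


# Hypothetical instance: phi(x, y) says that y squared is at least x.
def phi(x: int, y: int) -> bool:
    return y * y >= x


w = collection_bound(phi, 10)
```

Here the single number `w` replaces the ten separate witnesses, one for each `x < 10`.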


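Theorem 6.1.2 (Parsons [239]). Fix $n \geq 0$ and let $t$ be a term. 


1. If $\varphi(x, \vec{z})$ is a $\Sigma^0_n$ formula, then $(\forall x < t)\varphi(x, \vec{z})$ is equivalent over $\mathsf{PA}^- + \mathsf{B}\Sigma^0_n$ to 
a $\Sigma^0_n$ formula. 

2. If $\varphi(x, \vec{z})$ is a $\Pi^0_n$ formula, then $(\exists x < t)\varphi(x, \vec{z})$ is equivalent over $\mathsf{PA}^- + \mathsf{B}\Sigma^0_n$ to 
a $\Pi^0_n$ formula. 


Proof. The proof is by induction on $n$. The result is obvious for $n = 0$ since $\Sigma^0_0 = \Pi^0_0$ 
formulas are closed under bounded quantification. So fix $n > 0$, and assume the 
result is true for $n - 1$. We prove (1) for $n$, (2) being analogous. Say $\varphi(x, \vec{z})$ is 
$(\exists y)\psi(x, y, \vec{z})$, where $\psi$ is $\Pi^0_{n-1}$. Then in $\mathsf{PA}^- + \mathsf{B}\Sigma^0_n$, we have 


$(\forall x < t)\varphi(x, \vec{z}) \leftrightarrow (\exists w)(\forall x < t)(\exists y < w)\psi(x, y, \vec{z})$. 


Applying the inductive hypothesis to $(\exists y < w)\psi(x, y, \vec{z})$, we obtain an equivalent 
$\Pi^0_{n-1}$ formula $\theta(w, x, \vec{z})$. Thus, $(\forall x < t)\varphi(x, \vec{z})$ is equivalent to 
$(\exists w)(\forall x < t)\theta(w, x, \vec{z})$, which is $\Sigma^0_n$. □ 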


Bounding is more than just a technical tool for making quantifiers behave, however. 
In Section 6.5, we will see several examples of how it shows up (almost insidiously) 
in the most common types of arguments. 

With Proposition 6.1.2 in hand, we can now state and prove the following famous 
facts from the study of models of arithmetic. 


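Theorem 6.1.3 (Parsons [239]; Kirby and Paris [237]). Fix $n \geq 0$. The following 
are provable in $\mathsf{PA}^-$. 


1. $\mathsf{B}\Sigma^0_{n+1}$ is equivalent to $\mathsf{B}\Pi^0_n$. 

2. $\mathsf{I}\Sigma^0_n + \mathsf{B}\Sigma^0_0$ implies $\mathsf{B}\Sigma^0_n$. 

3. $\mathsf{I}\Sigma^0_0 + \mathsf{B}\Sigma^0_{n+1}$ implies $\mathsf{I}\Sigma^0_n$. 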


Proof. For (1), the implication $\mathsf{B}\Sigma^0_{n+1} \to \mathsf{B}\Pi^0_n$ is immediate. For the converse, 
suppose we are given a $\Sigma^0_{n+1}$ formula $\varphi(x, y) \equiv (\exists u)\psi(x, y, u)$, where $\psi$ is $\Pi^0_n$. Fix 
any $z$, and suppose that $(\forall x < z)(\exists y)\varphi(x, y)$ holds. Define $\theta(x, v)$ to be the formula 

$(\forall y, u \leq v)[v = \langle y, u \rangle \to \psi(x, y, u)]$, 


which is $\Pi^0_n$. Then we have $(\forall x < z)(\exists v)\theta(x, v)$, and so by $\mathsf{B}\Pi^0_n$ we can fix $w$ such 
that $(\forall x < z)(\exists v < w)\theta(x, v)$. This means $(\forall x < z)(\exists y < w)\varphi(x, y)$. 

Part (2) is proved by (external) induction on $n$. For $n = 0$, there is nothing to show 
since we are arguing over $\mathsf{PA}^- + \mathsf{B}\Sigma^0_0$. So fix $n > 0$, and assume the result is true 
for $n - 1$: that is, $\mathsf{PA}^- + \mathsf{I}\Sigma^0_{n-1} + \mathsf{B}\Sigma^0_0$ implies $\mathsf{B}\Sigma^0_{n-1}$. Arguing in $\mathsf{PA}^- + \mathsf{I}\Sigma^0_n + \mathsf{B}\Sigma^0_0$, it 
suffices to prove $\mathsf{B}\Pi^0_{n-1}$, by part (1). So suppose $\varphi(x, y)$ is a $\Pi^0_{n-1}$ formula such that 
for some $z$, 


$(\forall x < z)(\exists y)\varphi(x, y)$. 


Let $\psi(u)$ be the formula $u > z \vee (\exists w)(\forall x < u)(\exists y < w)\varphi(x, y)$. By inductive 
hypothesis, we have $\mathsf{B}\Sigma^0_{n-1}$, so we may apply Proposition 6.1.2 to conclude that $\psi$ is 
equivalent to a $\Sigma^0_n$ formula. Now obviously, $\psi(0)$ holds, and it is easy to see that so 
does $(\forall u)[\psi(u) \to \psi(u + 1)]$. By $\Sigma^0_n$ induction, we conclude $(\forall u)\psi(u)$. Then $\psi(z)$ 
gives the conclusion of $\mathsf{B}\Sigma^0_n$ applied to $\varphi$. 

For (3), we again proceed by induction on $n$. For $n = 0$, the result is trivial since we 
are working in $\mathsf{PA}^- + \mathsf{I}\Sigma^0_0$. So fix $n > 0$, and assume the result is true for $n - 1$. We now 
argue in $\mathsf{PA}^- + \mathsf{I}\Sigma^0_0 + \mathsf{B}\Sigma^0_{n+1}$. Let $\varphi(x)$ be a $\Sigma^0_n$ formula and assume 
$\varphi(0) \wedge (\forall x)[\varphi(x) \to \varphi(x + 1)]$. 
Write $\varphi(x)$ as $(\exists y)\psi(x, y)$ for some $\Pi^0_{n-1}$ formula $\psi$, and define 


$\theta(x, y) \equiv \psi(x, y) \vee (\forall v)\neg\psi(x, v)$, 


which is a disjunction of a $\Pi^0_{n-1}$ formula and a $\Pi^0_n$ formula, and so is $\Pi^0_n$. Fix any $z$. 
Then we have 


$(\forall x < z)(\exists y)\theta(x, y)$, 

and so by $\mathsf{B}\Pi^0_n$ (which is equivalent to $\mathsf{B}\Sigma^0_{n+1}$ by part (1)) there exists a $w$ such that 


$(\forall x < z)(\exists y < w)\theta(x, y)$. 


In particular, for all $x < z$, if $\psi(x, y)$ holds for some $y$ then it holds for some $y < w$. 
Now, by Proposition 6.1.2, $(\exists y < w)\psi(x, y)$ is equivalent to a $\Pi^0_{n-1}$ formula. And 
since $\mathsf{B}\Sigma^0_{n+1} \to \mathsf{B}\Sigma^0_n$, it follows by inductive hypothesis that we may avail ourselves 
of $\mathsf{I}\Sigma^0_{n-1}$. So, by induction on the $\Pi^0_{n-1}$ formula 


$(\exists y < w)\psi(x, y) \vee x \geq z$, 


we conclude that $(\forall x < z)(\exists y < w)\psi(x, y)$. In particular, $\varphi(x)$ holds for all $x < z$. 
Since $z$ was arbitrary, this yields $(\forall x)\varphi(x)$, as needed. □ 


We will see in Section 6.3 that the implications in parts (2) and (3) above are strict. 
That is, $\mathsf{B}\Sigma^0_{n+1}$ is intermediate between $\mathsf{I}\Sigma^0_n$ and $\mathsf{I}\Sigma^0_{n+1}$. This is complemented by a 
result of Slaman that we discuss at the end of this section. 

We add a further principle to our discussion, this time one that is very clearly just 
a restatement of induction. We formally prove the equivalence below. Nonetheless, 
having this formulation explicitly is very convenient. 


Definition 6.1.4. Let $\Gamma$ be a collection of formulas of $\mathcal{L}_2$. The $\Gamma$ least number 
principle ($\mathsf{L}\Gamma$) is the scheme over all $\varphi(x) \in \Gamma$ of sentences of the form 


$(\exists x)\varphi(x) \to (\exists x)[\varphi(x) \wedge \neg(\exists y < x)\varphi(y)]$. 


Proposition 6.1.5. For each $n \geq 1$, $\mathsf{PA}^- \vdash \mathsf{I}\Sigma^0_n \leftrightarrow \mathsf{I}\Pi^0_n \leftrightarrow \mathsf{L}\Sigma^0_n \leftrightarrow \mathsf{L}\Pi^0_n$. 


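Proof. We argue in $\mathsf{PA}^-$. 


($\mathsf{I}\Sigma^0_n \to \mathsf{I}\Pi^0_n$): Assume $\mathsf{I}\Sigma^0_n$ and suppose $\varphi(x)$ is a $\Pi^0_n$ formula such that $\varphi(0) \wedge 
(\forall x)[\varphi(x) \to \varphi(x + 1)]$. If $\neg\varphi(x_0)$ for some $x_0$, define 


$\theta(x) \equiv x > x_0 \vee (\exists y)[x_0 = x + y \wedge \neg\varphi(y)]$. (6.1) 


Thus we have $\theta(0)$. Now, suppose $\theta(x)$ holds for some $x$. If $x \geq x_0$ then obviously 
$\theta(x + 1)$ holds as well. If $x < x_0$ then $\neg\varphi(x_0 - x)$ holds by definition, and hence 
we have $\neg\varphi((x_0 - x) - 1)$ by assumption on $\varphi$. So, again $\theta(x + 1)$ holds. By $\Sigma^0_n$ 
induction, we conclude $(\forall x)\theta(x)$. In particular, we have $\theta(x_0)$ and hence $\neg\varphi(0)$, a 
contradiction. 


($\mathsf{I}\Pi^0_n \to \mathsf{I}\Sigma^0_n$): Assume $\mathsf{I}\Pi^0_n$ and suppose $\varphi(x)$ is a $\Sigma^0_n$ formula such that $\varphi(0) \wedge 
(\forall x)[\varphi(x) \to \varphi(x + 1)]$. The proof is the same as that of the previous implication, 
except that (6.1) is replaced by 


$\theta(x) \equiv x > x_0 \vee (\forall y)[x_0 = x + y \to \neg\varphi(y)]$. 


($\mathsf{I}\Sigma^0_n \to \mathsf{L}\Pi^0_n$): Assume $\mathsf{I}\Sigma^0_n$. Let $\varphi(x)$ be a $\Pi^0_n$ formula, and define $\psi(x)$ to be the 
formula $(\forall y \leq x)\neg\varphi(y)$. By Proposition 6.1.2, $\psi$ is equivalent to a $\Sigma^0_n$ formula. Now, 
if there is no least element satisfying $\varphi$ then we must have $\psi(0)$, and for every $x$, 
$\psi(x)$ must imply $\psi(x + 1)$. By $\Sigma^0_n$ induction, this yields $(\forall x)\psi(x)$, hence $(\forall x)\neg\varphi(x)$. 


($\mathsf{L}\Pi^0_n \to \mathsf{I}\Sigma^0_n$): Assume $\mathsf{L}\Pi^0_n$ and suppose $\varphi(x)$ is a $\Sigma^0_n$ formula. If $(\forall x)\varphi(x)$ does not 
hold, we may let $x_0$ be the least $x$ such that $\neg\varphi(x)$. Now either $x_0 = 0$, in which case 
$\neg\varphi(0)$ holds, or $x_0 > 0$, in which case $(\exists x)[\varphi(x) \wedge \neg\varphi(x + 1)]$. 


($\mathsf{I}\Pi^0_n \to \mathsf{L}\Sigma^0_n$): Same as the proof that $\mathsf{I}\Sigma^0_n \to \mathsf{L}\Pi^0_n$. 


($\mathsf{L}\Sigma^0_n \to \mathsf{I}\Pi^0_n$): Same as the proof that $\mathsf{L}\Pi^0_n \to \mathsf{I}\Sigma^0_n$. □ 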


The following useful corollary is a formalization of the computability theoretic 
fact that every $\Sigma^0_1$ definable subset of $\omega$ is computably enumerable. 


Theorem 6.1.6. Let $\varphi(y)$ be a $\Sigma^0_1$ formula of $\mathcal{L}_2$. Then $\mathsf{RCA}_0$ proves there is a 
function $f \colon \mathbb{N} \to \mathbb{N}$ such that for every $y$, $\varphi(y)$ holds if and only if $y \in \operatorname{range}(f)$. 
Moreover, if $\varphi(y)$ holds for infinitely many $y$ then $f$ may be chosen to be injective. 


Proof. We reason in $\mathsf{RCA}_0$. First, suppose there is a finite set $F$ such that for all $y$, 
$y \in F$ if and only if $\varphi(y)$ holds. In this case, simply define $f$ by $f(y) = y$ for all 
$y \in F$ and $f(y) = \min F$ for all $y \notin F$. Clearly, $f$ exists and has the desired property. 

So suppose next that no such $F$ exists. Write $\varphi(y)$ as $(\exists x)\psi(x, y)$, where $\psi$ is $\Sigma^0_0$. 
Let $S$ be the set of all pairs $\langle x, y \rangle$ such that 


$\psi(x, y) \wedge (\forall x^* < x)\neg\psi(x^*, y)$. 


By $\mathsf{L}\Pi^0_0$, we have for all $y$ that $\varphi(y) \leftrightarrow (\exists x)[\langle x, y \rangle \in S]$. Thus if $S$ were bounded 
by some $b$, we would have that $\varphi(y) \leftrightarrow (\exists x < b)[\langle x, y \rangle \in S]$. Hence, the set of 
$y$ such that $\varphi(y)$ holds would exist by $\Delta^0_1$ comprehension, and being bounded, it 
would be finite (Definition 5.5.10). Since this contradicts our assumption, $S$ must be 
unbounded. 

Let $p_S$ be the principal function of $S$ (Exercise 5.13.24). In particular, $p_S$ is 
injective and its range is $S$. Define $f$ by 


$\langle z, y \rangle \in f \leftrightarrow (\exists x)[\langle z, \langle x, y \rangle \rangle \in p_S]$. 


Note that $f$ is a function because $p_S$ is, and it is injective because $p_S$ is and because 
if $\langle x, y \rangle, \langle x^*, y \rangle \in S$ then necessarily $x = x^*$ by definition of $S$. Since this definition 
is $\Sigma^0_1$, it follows by Exercise 5.13.6 that $f$ is $\Delta^0_1$ definable. Hence, $f$ exists. We 
have $y \in \operatorname{range}(f) \leftrightarrow (\exists x)[\langle x, y \rangle \in \operatorname{range}(p_S)] \leftrightarrow (\exists x)[\langle x, y \rangle \in S] \leftrightarrow \varphi(y)$, as 
desired. □ 
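
The construction in the infinite case can be mirrored by a short Python sketch. This is an informal illustration rather than a formalization in $\mathsf{RCA}_0$: the decidable relation `psi`, the use of the Cantor pairing function for `pair` and `unpair`, and the example set of perfect squares are assumptions made for the example. The generator walks through codes in increasing order, keeps exactly those pairs in which the first component is the least witness for the second, and outputs the second component, which is what composing the principal function of $S$ with a projection does.

```python
from itertools import count, islice
from math import isqrt
from typing import Callable, Iterator, Tuple


def pair(x: int, y: int) -> int:
    """Cantor pairing function (an assumed choice of coding for pairs)."""
    return (x + y) * (x + y + 1) // 2 + y


def unpair(code: int) -> Tuple[int, int]:
    """Inverse of the Cantor pairing function."""
    s = (isqrt(8 * code + 1) - 1) // 2
    y = code - s * (s + 1) // 2
    return s - y, y


def enumerate_sigma1(psi: Callable[[int, int], bool]) -> Iterator[int]:
    """Yield each y with (exists x) psi(x, y) exactly once.

    A code is kept only if its first component is the least witness for
    its second component, so no y is ever repeated.
    """
    for code in count():
        x, y = unpair(code)
        if psi(x, y) and not any(psi(x2, y) for x2 in range(x)):
            yield y


# Hypothetical instance: y is a perfect square, witnessed by x with x * x == y.
def psi(x: int, y: int) -> bool:
    return x * x == y


first = list(islice(enumerate_sigma1(psi), 5))
```

The generator only produces an infinite injective enumeration when infinitely many `y` have witnesses, matching the case distinction in the proof.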


We conclude this section by briefly turning to more advanced results, without 
any proofs. All of the principles we have discussed above have immediate $\Delta^0_n$ 
formulations. Namely, we define $\mathsf{I}\Delta^0_n$ to be the scheme consisting of all sentences of the 
form 


$(\forall x)[\varphi(x) \leftrightarrow \neg\psi(x)] \to [(\varphi(0) \wedge (\forall x)[\varphi(x) \to \varphi(x + 1)]) \to (\forall x)\varphi(x)]$, 


for $\Sigma^0_n$ formulas $\varphi(x)$ and $\psi(x)$. Similarly, we define $\mathsf{L}\Delta^0_n$ to be the scheme consisting 
of all sentences of the form 


$(\forall x)[\varphi(x) \leftrightarrow \neg\psi(x)] \to [(\exists x)\varphi(x) \to (\exists x)[\varphi(x) \wedge \neg(\exists y < x)\varphi(y)]]$, 


again for $\Sigma^0_n$ formulas $\varphi$ and $\psi$. It turns out that considering this class nicely ties the 
induction and bounding hierarchy together. 


Theorem 6.1.7 (Gandy, unpublished). For each $n > 0$, 
$\mathsf{PA}^- + \mathsf{I}\Sigma^0_0 \vdash \mathsf{B}\Sigma^0_n \leftrightarrow \mathsf{L}\Delta^0_n$. 


A full proof can be found in Chapter I of Hájek and Pudlák [134]. Surprisingly, the 
easy equivalence between induction and the least number principle we saw earlier 
does not go through in the $\Delta^0_n$ case. And indeed, the equivalence is only known to 
hold over a stronger base theory. In $\mathsf{PA}^- + \mathsf{I}\Sigma^0_0$ we can give a $\Sigma^0_0$ definition of the 
relation $z = y^x$, although we cannot prove the formula $\mathsf{Exp} \equiv (\forall x)(\forall y)(\exists z)[z = y^x]$ 
which expresses the totality of the exponential function. 


Theorem 6.1.8 (Slaman [291]). For each $n > 0$, 
$\mathsf{PA}^- + \mathsf{I}\Sigma^0_0 + \mathsf{Exp} \vdash \mathsf{B}\Sigma^0_n \leftrightarrow \mathsf{L}\Delta^0_n \leftrightarrow \mathsf{I}\Delta^0_n$. 


For $n = 1$, a different proof, over a slightly weaker base theory, has been found by 
Thapen [309]. For our purposes, the moral is the same: the bounding scheme really 
is an induction principle after all, just for a restricted class of formulas. 


6.2 Finiteness, cuts, and all that 


We now turn to a discussion of what a failure of induction can look like in a 
nonstandard model of arithmetic. 


Definition 6.2.1. Let $\mathcal{M}$ be an $\mathcal{L}_2$ (or $\mathcal{L}_1$) structure. Then $I \subseteq M$ is a cut of $\mathcal{M}$ if: 


• for all $x, y \in M$, if $x \in I$ and $y <^{\mathcal{M}} x$, then $y \in I$, 
• for all $x \in M$, if $x \in I$ then so is the successor of $x$ in $\mathcal{M}$. 


$I$ is a proper cut of $\mathcal{M}$ if $I \neq M$. 


Thus, one way $\Sigma^0_n$ induction can fail in a model $\mathcal{M}$ is if $\mathcal{M}$ has a proper $\Sigma^0_n$ definable 
cut. We will see in Theorem 6.2.3 that the converse is true as well. We begin this 
section with the following well-known result. The gist is that if $\mathcal{M}$ satisfies enough 
induction, then any arithmetical property holding inside or outside of a cut must spill 
across the cut. 


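Proposition 6.2.2. Fix $n > 0$ and $\mathcal{M} \models \mathsf{PA}^- + \mathsf{I}\Sigma^0_n$. Suppose $I$ is a proper cut of $\mathcal{M}$, 
and let $\varphi(x)$ be a $\Sigma^0_n$ formula. 

1. (Overspill): If $\mathcal{M} \models \varphi(a)$ for every $a \in I$, then $\mathcal{M} \models \varphi(b)$ for some $b \in M \setminus I$. 
2. (Underspill): If $\mathcal{M} \models \varphi(b)$ for every $b \in M \setminus I$ then $\mathcal{M} \models \varphi(a)$ for some $a \in I$. 


Proof. For (1), suppose otherwise. So for all $a \in M$, we have $\mathcal{M} \models \varphi(a)$ if and only 
if $a \in I$. As $I$ is a cut, this means 


$\mathcal{M} \models \varphi(0) \wedge (\forall x)[\varphi(x) \to \varphi(x + 1)]$. 


But then by $\mathsf{I}\Sigma^0_n$ we have $I = M$, a contradiction. Part (2) is proved the same way, 
using the equivalence of $\mathsf{I}\Sigma^0_n$ with $\mathsf{I}\Pi^0_n$. □ 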


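The following fundamental result relates cuts, induction, and boundedness. 


Theorem 6.2.3 (Friedman, unpublished). Fix $n \geq 1$ and suppose $\mathcal{M} \models \mathsf{PA}^- + \mathsf{I}\Sigma^0_0$. 
The following are equivalent. 


1. $\mathcal{M} \models \mathsf{I}\Sigma^0_n$. 
2. $\mathcal{M}$ has no $\Sigma^0_n$ definable proper cut. 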
3. Every $\Sigma^0_n$ definable function on a proper cut of $\mathcal{M}$ has bounded range. 


Proof. The implication from (1) to (2) is clear: if $\mathcal{M}$ has a $\Sigma^0_n$ definable proper cut, 
the formula defining this cut witnesses a failure of $\Sigma^0_n$ induction in $\mathcal{M}$. Likewise for 
the implication from (2) to (3), since the domain of every $\Sigma^0_n$ definable function is 
$\Sigma^0_n$ definable. It therefore remains to prove that (3) implies (1). 

We proceed by induction on $n$. If $n = 0$, there is nothing to prove since $\mathcal{M} \models \mathsf{I}\Sigma^0_0$. 
So fix $n > 0$, assume (3) for $n$, and assume that the implication holds for $n - 1$. By 
inductive hypothesis, we have $\mathsf{I}\Sigma^0_{n-1}$, and hence also $\mathsf{B}\Sigma^0_{n-1}$. However, suppose $\mathsf{I}\Sigma^0_n$ 
fails. Let $\varphi(z)$ be a $\Sigma^0_n$ formula such that 


$\mathcal{M} \models \varphi(0) \wedge (\forall z)[\varphi(z) \to \varphi(z + 1)] \wedge \neg\varphi(c)$ 


for some $c \in M$. Say $\varphi(z)$ is $(\exists w)\psi(z, w)$, where $\psi$ is $\Pi^0_{n-1}$. Now define $\theta(x, y)$ to 
be the formula 


$y = \langle z, w \rangle \wedge x \leq z < c \wedge \psi(z, w) \wedge (\forall \langle z^*, w^* \rangle < \langle z, w \rangle)\neg\psi(z^*, w^*)$. 


The rightmost conjunct here is $\Sigma^0_{n-1}$, by Proposition 6.1.2, so the entire formula is $\Sigma^0_n$. 
First, note that $\theta$ defines a function on a proper cut of $\mathcal{M}$. Indeed, the domain of $\theta$ is 
the set of $a \in M$ such that $\mathcal{M} \models (\exists z)[a \leq z < c \wedge \varphi(z)]$, which is a cut bounded by $c$. 
For any such $a$, it is clear that $\mathcal{M}$ satisfies $(\exists y)[y = \langle z, w \rangle \wedge a \leq z < c \wedge \psi(z, w)]$. 
Then $\mathsf{L}\Pi^0_{n-1}$ shows that $\mathcal{M}$ satisfies $(\exists y)\theta(a, y)$. 
Next, we claim that $\theta$ has unbounded range. If not, fix $d \in M$ so that 


$\mathcal{M} \models (\forall x)(\forall y)[\theta(x, y) \to y < d]$. 


In this case, let $\xi(z)$ be the formula $(\exists w < d)\psi(z, w)$. Then, by Proposition 6.1.2 
again, $\xi$ is equivalent to a $\Pi^0_{n-1}$ formula, and it follows that 


$\mathcal{M} \models \xi(0) \wedge (\forall z)[\xi(z) \to \xi(z + 1)] \wedge \neg\xi(c)$, 


which contradicts $\mathsf{I}\Sigma^0_{n-1}$. This proves the claim. 
But now we have exhibited a function witnessing that (3) is false for $n$, contrary 
to assumption. Thus, $\mathsf{I}\Sigma^0_n$ must hold and the proof is complete. □ 


An important point of caution is that nonstandard models are not characterized 
exclusively by failures of induction. Indeed, even $\mathsf{PA}$ and $\mathsf{RCA}$, which include 
induction for all $\mathcal{L}_2$ formulas, have nonstandard models. In particular, although it can be 
intuitively tempting to conclude that the models of $\mathsf{RCA}$ are precisely the $\omega$-models 
of $\mathsf{RCA}_0$, this is not the case. 

It is interesting and instructive to compare the equivalence of (1) and (3) above 
with the equivalence in the following. 


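Proposition 6.2.4. Fix $n \geq 1$ and $\mathcal{M} \models \mathsf{PA}^- + \mathsf{I}\Sigma^0_{n-1} + \mathsf{B}\Sigma^0_0$. The following are 
equivalent. 


1. $\mathcal{M} \models \mathsf{B}\Sigma^0_n$. 
2. For every $b \in M$, every $\Sigma^0_n$ definable function on $[0, b)$ has bounded range. 


Proof. That (1) implies (2) follows immediately from the definition of $\mathsf{B}\Sigma^0_n$. 
Conversely, assume (2) for $n$. We show $\mathsf{B}\Pi^0_{n-1}$ holds in $\mathcal{M}$. Let $\varphi(x, y)$ be a $\Pi^0_{n-1}$ formula 
such that for some $b \in M$, 


$\mathcal{M} \models (\forall x < b)(\exists y)\varphi(x, y)$. 


Define $\theta(x, y)$ to be the formula 


$x < b \wedge \varphi(x, y) \wedge (\forall y^* < y)\neg\varphi(x, y^*)$. 


Since $\mathcal{M}$ satisfies $\mathsf{I}\Sigma^0_{n-1} + \mathsf{B}\Sigma^0_0$, it satisfies $\mathsf{B}\Sigma^0_{n-1}$. By Proposition 6.1.2, the rightmost 
conjunct above is equivalent in $\mathcal{M}$ to a $\Sigma^0_{n-1}$ formula, and therefore $\theta$ is $\Sigma^0_n$. Moreover, 
$\theta$ defines a function on $[0, b)$. Indeed, $\mathcal{M}$ satisfies that for every $x < b$ there is a $y$ 
such that $\varphi(x, y)$ holds, hence there is a least such $y$ by $\mathsf{L}\Pi^0_{n-1}$. By (2) we can fix a 
$c \in M$ bounding the range of $\theta$. This means $\mathcal{M} \models (\forall x < b)(\exists y < c)\varphi(x, y)$. □ 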


We saw in Section 5.5.2 that, in a model $\mathcal{M}$ of $\mathsf{RCA}_0$, we cannot treat every 
bounded subset of $M$ as we would a bounded (and hence finite) subset of $\omega$. 
Theorem 5.5.8 shows that we can do so if $X \in \mathcal{S}^{\mathcal{M}}$. The following result extends the 
theorem to all $\Sigma^0_n$-definable subsets of $M$ under the additional hypothesis that $\mathcal{M}$ 
satisfies $\Sigma^0_n$ induction. 


Proposition 6.2.5. Fix $n \geq 1$ and $\mathcal{M} \models \mathsf{RCA}_0$. The following are equivalent.


1. $\mathcal{M} \models \mathsf{I}\Sigma^0_n$.
2. Every $\Sigma^0_n$ definable bounded subset of $M$ is $\mathcal{M}$-finite.


Proof. To prove (1) implies (2), assume $\mathcal{M} \models \mathsf{I}\Sigma^0_n$. Let $\varphi(x)$ be a $\Sigma^0_n$ formula and fix
$k \in M$. Let $m = k!$, which can be defined in $\mathcal{M}$ by primitive recursion as in the proof
of Theorem 5.5.8, and consider the following formula in which $n$ is free:


$(\forall i < k)[\varphi(i) \to m(i+1)+1 \text{ divides } n]$. (6.2)


Obviously, (6.2) holds in $\mathcal{M}$ of $\prod_{i<k} (m(i+1)+1)$. Moreover, by $\mathsf{B}\Sigma^0_n$ in $\mathcal{M}$, (6.2)
is $\Pi^0_n$. Hence by $\mathsf{L}\Pi^0_n$, there is a least $n \in M$ satisfying (6.2). Suppose there is an
$i <^{\mathcal{M}} k$ such that $\mathcal{M} \models m(i+1)+1$ divides $n$ but $\neg\varphi(i)$, and write $n = (m(i+1)+1)n^*$
for some $n^* <^{\mathcal{M}} n$. By the proof of Theorem 5.5.8, $\mathcal{M}$ satisfies


$(\forall j < k)[j \neq i \to m(j+1)+1 \text{ and } m(i+1)+1 \text{ are relatively prime}]$.


Hence, $n^*$ still satisfies (6.2) in $\mathcal{M}$, contradicting the choice of $n$. We conclude that
in $\mathcal{M}$,

$(\forall i < k)[\varphi(i) \leftrightarrow m(i+1)+1 \text{ divides } n]$.


In other words, $\mathcal{M}$ satisfies that $\langle k, \langle m, n \rangle\rangle$ represents $\{i < k : \varphi(i)\}$ in the sense of
Definition 5.5.5, and hence the latter set is $\mathcal{M}$-finite.

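Conversely, let $\varphi(x)$ be a $\Sigma^0_n$ formula such that $\mathcal{M} \models \varphi(0) \wedge (\forall x)[\varphi(x) \to \varphi(x+1)]$
and fix an arbitrary $a \in M$. By (2), the set $F_a$ of $b \in M$ such that $\mathcal{M} \models b < a+1 \wedge \varphi(b)$
is $\mathcal{M}$-finite. Using a code for $F_a$ allows us to express $x \in F_a$ as a $\Sigma^0_0$ formula. Let
$\psi(x)$ be the $\Sigma^0_0$ formula $x > a \vee x \in F_a$. Then $\mathcal{M} \models \psi(0) \wedge (\forall x)[\psi(x) \to \psi(x+1)]$.
By $\mathsf{I}\Sigma^0_0$ in $\mathcal{M}$, we conclude that $\mathcal{M} \models \psi(a)$, so $a \in F_a$ and hence $\mathcal{M} \models \varphi(a)$. □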
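
The coding used in the first half of this proof can be tried out over the standard natural numbers. The following Python sketch builds the number $n$ of (6.2) as a product and then decodes it by divisibility. The function names are illustrative and do not come from the text, and `phi` is assumed to be a decidable Python predicate, whereas in the proposition it is an arbitrary $\Sigma^0_n$ formula evaluated in a possibly nonstandard model.

```python
from math import factorial, gcd, prod


def encode_bounded_set(k, phi):
    """Return (k, m, n) coding {i < k : phi(i)} in the style of (6.2).

    m = k! makes the moduli m*(i+1)+1 pairwise coprime for i < k,
    and n is the product of the moduli selected by phi.
    """
    m = factorial(k)
    n = prod(m * (i + 1) + 1 for i in range(k) if phi(i))
    return k, m, n


def decode_bounded_set(k, m, n):
    """Recover the coded set: i is a member iff m*(i+1)+1 divides n."""
    return {i for i in range(k) if n % (m * (i + 1) + 1) == 0}


def moduli_pairwise_coprime(k):
    """Check the coprimality fact that the proof borrows from Theorem 5.5.8."""
    m = factorial(k)
    mods = [m * (i + 1) + 1 for i in range(k)]
    return all(gcd(a, b) == 1
               for idx, a in enumerate(mods)
               for b in mods[idx + 1:])


# Example: code the even numbers below 6 and read them back.
code = encode_bounded_set(6, lambda i: i % 2 == 0)
print(decode_bounded_set(*code))
```

Because the moduli are pairwise coprime, no modulus outside the selected set divides the product, which is why decoding by divisibility is exact.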


We can use this result to generalize bounded $\Sigma^0_1$ comprehension (Theorem 5.5.11)
to higher levels of the arithmetical hierarchy. Of course, $\mathsf{ACA}_0$ includes regular
(unbounded) $\Sigma^0_n$ comprehension for each $n \in \omega$, while $\mathsf{WKL}_0$ only includes $\Sigma^0_1$
induction. The generalization here is useful if we work in systems such as $\mathsf{RCA}_0 + \mathsf{I}\Sigma^0_2$.


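Theorem 6.2.6 (Bounded $\Sigma^0_n$ comprehension). For every $n \geq 1$, and every $\Sigma^0_n$
formula $\varphi(x)$,


$\mathsf{RCA}_0 + \mathsf{I}\Sigma^0_n \vdash (\forall z)(\exists X)(\forall x)[x \in X \leftrightarrow x < z \wedge \varphi(x)]$.


Proof. Fix $\mathcal{M} \models \mathsf{RCA}_0 + \mathsf{I}\Sigma^0_n$ along with $a \in M$. By Proposition 6.2.5, there is a code
in $M$ for the set $F = \{x < a : \varphi(x)\}$. As before, the code allows us to express
$x \in F$ as a $\Sigma^0_0$ formula. Thus the set $F$ can be formed using $\Delta^0_1$ comprehension. □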


Another nice corollary of Proposition 6.2.5 is the following variation of the 
finitary pigeonhole principle. 


Proposition 6.2.7. For each $n \geq 1$, the following is provable in $\mathsf{RCA}_0 + \mathsf{I}\Sigma^0_n$: for each
$x \geq 1$, there is no $\Sigma^0_n$ definable injection $x \to y$ for any $y < x$.
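
As a sanity check on the statement in the standard model only, the following Python sketch searches exhaustively for an injection from $x$ into $y$ for small values. The function name `has_injection` is illustrative, and the brute-force search says nothing about definable functions in nonstandard models, which is where the proposition has real content.

```python
from itertools import product


def has_injection(x, y):
    """Search all functions from range(x) to range(y) for an injective one."""
    for values in product(range(y), repeat=x):
        # values[w] plays the role of f(w); injective iff all values differ.
        if len(set(values)) == x:
            return True
    return False


# No injection x -> y exists when y < x, for every small x >= 1.
print(all(not has_injection(x, y) for x in range(1, 6) for y in range(x)))
```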


Proof. We argue in $\mathsf{RCA}_0 + \mathsf{I}\Sigma^0_n$. By Proposition 6.2.5, every $\Sigma^0_n$ definable function
$x \to y$ for $x, y \in \mathbb{N}$ is coded by a number. Hence, the statement to be proved is a $\Pi^0_1$
formula in $x$, and so we can prove it using $\mathsf{I}\Pi^0_1$. Clearly, the result holds for $x = 1$.
So fix $x > 1$ and assume the result holds for $x - 1$. If the result fails for $x$, we can
fix a (code for a) witnessing injection $f : x \to y$ for some $y < x$. Without loss of
generality, we can take $y - 1$ to be in the range of $f$. (More precisely, by $\mathsf{L}\Pi^0_1$ we can
fix the least $y^* \leq y$ such that $(\forall z \geq y^*)(\forall w < x)[f(w) \neq z]$, and then replace $y$ by
$y^*$ if necessary.) Now, we must have $y = x - 1$, as otherwise $f \restriction x - 1$ would be an
injection from $x - 1$ into a number smaller than $x - 1$, which cannot be. Likewise, if
$f(x-1) = x-2$ then $f \restriction x - 1$ would be an injection from $x - 1$ to $x - 2$, again an
impossibility. Thus, there exists $w^* < x - 1$ so that $f(w^*) = x-2$ and $f(w) < x-2$
for all $w < x$ different from $w^*$. We can then define $g : x - 1 \to x - 2$ by

$$g(w) = \begin{cases} f(w) & \text{if } w \neq w^*, \\ f(x-1) & \text{if } w = w^*, \end{cases}$$

for all $w < x - 1$. Now $g$ is obviously injective on all $w < x - 1$ different from $w^*$
since it agrees with $f$. And we also cannot have $g(w) = g(w^*)$ for any $w \neq w^*$, since
otherwise we would have $f(w) = f(x - 1)$ for some $w \neq x - 1$. Thus, $g$ is injective,
which contradicts the inductive hypothesis for $x - 1$. □


Figure 6.1. The Kirby–Paris hierarchy; $n \geq 1$ is arbitrary. An arrow from one principle to another
indicates an implication provable in $\mathsf{RCA}_0$. No arrows can be reversed.


6.3 The Kirby—Paris hierarchy 


The aim of this section is to show that the implications between induction and 
bounding principles at corresponding levels of the arithmetical hierarchy are strict. 
Thus, we have an increasing sequence under provability, as illustrated in Figure 6.1. 
In the literature, this is often called the Kirby—Paris hierarchy after the seminal paper 
of Paris and Kirby [237] in which it was first established. We follow the standard 
outline for separating the principles in the hierarchy, which is roughly the same as 
that of the original proofs in [237], with minor deviations. Similar treatments can be 
found in Kaye [176] and Hájek and Pudlák [134].

We first prove separations of the Kirby–Paris hierarchy in $\mathcal{L}_1$, and then apply
conservativity to extend the results to $\mathcal{L}_2$ and $\mathsf{RCA}_0$. Thus, in the following definition,
lemmas, and theorems, all structures should be understood to be $\mathcal{L}_1$ structures. We
write $M$ rather than $\mathcal{M}$ for these structures, to emphasize that they can be identified
with their first order parts. Exercise 6.7.7 shows the following definition is nontrivial.




Definition 6.3.1. Fix n > 1 and M & PA +19. 


1. $K_n(M)$ is the set of all $\Sigma^0_n$ definable elements of $M$ (without parameters).
2. For $B \subseteq M$, $K_n(M, B)$ is the set of all $\Sigma^0_n(B)$ definable elements of $M$.


We may regard $K_n(M, B)$ as a substructure of $M$.


Lemma 6.3.2. For each n > 1, there exists M — PA” + Ix? such that K,(M, @) 
contains nonstandard elements. 


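Proof. Consider any $\Pi^0_1$ sentence not provable in, but consistent with, $\mathsf{PA}$, e.g.,
$\mathrm{Con}(\mathsf{PA})$. Write this as $\neg(\exists x)\, \varphi(x)$ where $\varphi$ is $\Sigma^0_0$, and fix $M \models \mathsf{PA} + (\exists x)\, \varphi(x)$. Let
$\psi(x)$ be the $\Sigma^0_0$ formula $\varphi(x) \wedge \neg(\exists y < x)\, \varphi(y)$. By Theorem 6.1.3, $M \models \mathsf{L}\Sigma^0_0$, and so
$M \models \psi(a)$ for some $a$. Clearly, $\psi$ defines $a$ in $M$, so $a \in K_n(M, \emptyset)$. But $a$ cannot be
standard, else $\psi(a)$ could be turned into an actual proof (in $\mathsf{PA}$) of $\neg\mathrm{Con}(\mathsf{PA})$. □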


Lemma 6.3.3. Fix n > 1 and M & PA” +129. For every B © M, Kn(M, B) is a x9 
elementary substructure of $M$ and $K_n(M, B) \models \mathsf{PA}^-$.


Proof. To show $K_n(M, B)$ is a $\Sigma^0_n$ elementary substructure of $M$, we use the
Tarski–Vaught test. Let $\varphi(x, z_0, \ldots, z_{k-1})$ be a $\Pi^0_{n-1}$ formula and suppose $M \models
(\exists x)\, \varphi(x, b_0, \ldots, b_{k-1})$ for some $b_0, \ldots, b_{k-1} \in K_n(M, B)$. We exhibit an $a \in
K_n(M, B)$ such that $M \models \varphi(a, b_0, \ldots, b_{k-1})$. By Theorem 6.1.3, $M \models \mathsf{L}\Pi^0_{n-1}$, and so
$M$ satisfies


$(\exists x)[\varphi(x, b_0, \ldots, b_{k-1}) \wedge \neg(\exists y < x)\, \varphi(y, b_0, \ldots, b_{k-1})]$. (6.3)

We claim that $(\exists y < x)\, \varphi(y, z_0, \ldots, z_{k-1})$ is equivalent in $M$ to a $\Pi^0_{n-1}$ formula
$\psi(x, z_0, \ldots, z_{k-1})$. If $n = 1$, this is obvious because $\varphi$ is then $\Pi^0_0$ and this class of
formulas is closed under bounded quantification. If $n > 1$, we instead obtain it from
Proposition 6.1.2 using the fact that, by Theorem 6.1.3, $M \models \mathsf{B}\Sigma^0_{n-1}$. Now, we can
fix a tuple of elements $\vec{c}$ from $K_n(M, B)$ and, for each $i < k$, a $\Sigma^0_n$ formula $\theta_i(x, \vec{z})$
such that $\theta_i(x, \vec{c})$ defines $b_i$ in $M$. Let $\gamma(x)$ be the formula


$(\exists z_0, \ldots, z_{k-1}) \Big[ \bigwedge_{i<k} \theta_i(z_i, \vec{c}) \wedge \varphi(x, z_0, \ldots, z_{k-1}) \wedge \neg\psi(x, z_0, \ldots, z_{k-1}) \Big]$.


Then (6.3) is equivalent to $(\exists x)\, \gamma(x)$, so $\gamma$ defines the $<^M$-least element $a \in M$ such
that $M \models \varphi(a, b_0, \ldots, b_{k-1})$. But since $\gamma$ is $\Sigma^0_n$, it follows that $a \in K_n(M, B)$, as was
required.

To prove that $K_n(M, B) \models \mathsf{PA}^-$, we refer to Definition 5.3.2. All axioms except
one are $\Pi^0_1$, and so since $n \geq 1$, they hold in $K_n(M, B)$ because they hold in $M$. The
remaining axiom is

$(\forall x, y)(\exists z)[x < y \to x + z = y]$.


To verify this holds in $K_n(M, B)$, fix $a, b \in K_n(M, B)$ and $c \in M$ such that $a <^M b$ and $M \models
a + c = b$. Since $M$ is a model of $\mathsf{PA}^-$, it satisfies $(\forall x, z, z^*)[x + z = x + z^* \to z = z^*]$.
Thus, $c$ is $\Sigma^0_0$ definable in $M$ using $a$ and $b$ as parameters. Since $a$ and $b$ are $\Sigma^0_n(B)$
definable in $M$, it follows that so is $c$. Hence, $c \in K_n(M, B)$ and $K_n(M, B) \models a + c = b$
by elementarity. □


We now proceed to the main results. In Theorem 6.3.4 we show that induction
for $\Sigma^0_{n-1}$ formulas is weaker than bounding for $\Sigma^0_n$ formulas, and in Theorem 6.3.5
we show that bounding for $\Sigma^0_n$ formulas is weaker than induction for $\Sigma^0_n$ formulas.


Theorem 6.3.4 (Parsons [239]). For each $n \geq 1$, there exists a model of
$\mathsf{PA}^- + \mathsf{I}\Sigma^0_{n-1} + \neg\mathsf{B}\Sigma^0_n$.


Proof (Paris and Kirby [237]). Fix a model $M \models \mathsf{PA}$ containing nonstandard $\Sigma^0_1$ (and
hence $\Sigma^0_n$) definable elements, which exists by Lemma 6.3.2. Let $K_n = K_n(M, \emptyset)$.
We claim this is the desired model.

By Lemma 6.3.3, $K_n \models \mathsf{PA}^-$ and $K_n$ is a $\Sigma^0_n$ elementary substructure of $M$. Also,
if $\varphi(x)$ is a $\Pi^0_{n-1}$ formula such that $K_n \models (\exists x)\, \varphi(x)$ then $M \models (\exists x)\, \varphi(x)$. By $\mathsf{L}\Pi^0_{n-1}$ in
$M$ we have that $M \models (\exists x)[\varphi(x) \wedge (\forall y < x)\, \neg\varphi(y)]$. But the latter is a $\Sigma^0_n$ sentence,
and so holds in $K_n$ by elementarity. By Theorem 6.1.3, we conclude that $K_n \models \mathsf{I}\Sigma^0_{n-1}$.

Next, we claim that $K_n \models \neg\mathsf{B}\Sigma^0_n$. Fix a universal $\Sigma^0_n$ formula of the form
$(\exists z)\, \psi(w, x, z)$, where $\psi$ is $\Pi^0_{n-1}$. So for each $a \in K_n$ there is an $e \in \omega$ such
that $(\exists z)\, \psi(e, x, z)$ defines $a$ in $M$. Let $\theta(w, x)$ be the formula


$\theta(w, x) \equiv (\exists z)[\psi(w, x, z) \wedge \neg(\exists \langle x^*, z^* \rangle < \langle x, z \rangle)\, \psi(w, x^*, z^*)]$.


Proposition 6.1.2 implies that $\theta(w, x)$ is equivalent in $M$ to a $\Sigma^0_n$ formula.
By $\mathsf{L}\Sigma^0_n$, for each $a \in K_n$ there is an $e \in \omega$ such that $\theta(e, a)$ is true in $M$, and by
elementarity, also in $K_n$. In particular, for any nonstandard $b \in K_n$ we have

$K_n \models (\forall x < b+1)(\exists w < b)\, \theta(w, x)$.


Now if $\mathsf{B}\Sigma^0_n$ held in $K_n$, then by Proposition 6.1.2, the formula displayed above would be
equivalent in $K_n$ to a $\Sigma^0_n$ formula. Hence, it would also be true in $M$, and so


$M \models (\forall x < b+1)(\exists w < b)\, \theta(w, x)$.


But clearly $M \models \neg(\exists x, x^*, w)[x \neq x^* \wedge \theta(w, x) \wedge \theta(w, x^*)]$, so $\theta$ would be an
arithmetical definition of an injection from $b + 1$ into $b$ in $M$. This is impossible by
Proposition 6.2.7 since $M \models \mathsf{PA}$. □


Theorem 6.3.5 (Lessan [196], Paris and Kirby [237]). For each $n \geq 1$, there exists
a model of $\mathsf{PA}^- + \mathsf{B}\Sigma^0_n + \neg\mathsf{I}\Sigma^0_n$.


Proof. By Lemma 6.3.2, fix $M \models \mathsf{PA}$ with nonstandard $\Sigma^0_{n-1}$ definable elements.
Now, set $B_0 = K_{n-1}(M, \emptyset)$, and for each $i \in \omega$, inductively define


$B_{2i+1} = \{x \in M : (\exists y \in B_{2i})[x < y]\}$

and

$B_{2i+2} = K_{n-1}(M, B_{2i+1})$.


Clearly, $B_i \subseteq B_{i+1}$ for all $i$. Let $B = \bigcup_{i \in \omega} B_i$. Every tuple of parameters from $B$ is
then contained in $B_i$ for some $i$, and so since $B_i$ is a $\Sigma^0_{n-1}$ elementary substructure
of $M$ by Lemma 6.3.3, it follows that so is $B$.

To show $B \models \mathsf{B}\Sigma^0_n$, let $\varphi(x, y)$ be a $\Pi^0_{n-1}$ formula and suppose that for some
$d \in B$ we have that $B \models (\forall x < d)(\exists y)\, \varphi(x, y)$. Since $B$ is closed downward under
$<^M$, it follows by elementarity that $M \models (\forall x < d)(\exists y)\, \varphi(x, y)$. Now $M \models \mathsf{PA}$,
so there is a $b \in M$ so that $M \models (\forall x < d)(\exists y \leq b)\, \varphi(x, y)$. We can also fix
the $<^M$-least such $b$; denote this by $b^*$. Then there must be an $a^* <^M d$ so that
$M \models \varphi(a^*, b^*) \wedge (\forall y < b^*)\, \neg\varphi(a^*, y)$. Since $B \models (\exists y)\, \varphi(a^*, y)$ there is a $b \in B$ such
that, by elementarity, $M \models \varphi(a^*, b)$, and so necessarily $b^* \leq^M b$. This implies that
$b^* \in B$, and hence $B \models (\forall x < d)(\exists y \leq b^*)\, \varphi(x, y)$. Thus, $B \models \mathsf{B}\Pi^0_{n-1}$, which is
equivalent to $\mathsf{B}\Sigma^0_n$ by Theorem 6.1.3 (1).

To show that $B \models \neg\mathsf{I}\Sigma^0_n$, we prove that the standard part of $B$ is $\Sigma^0_n$ definable.
Fix a universal $\Sigma^0_{n-1}$ formula of the form $(\exists z)\, \psi(w, x, y, z)$, where $\psi$ is $\Pi^0_{n-2}$. Define
$\theta(u, x)$ to be the following:


$(\exists i < u)(\exists \langle w_0, \ldots, w_i \rangle < u)(\forall j < i)(\exists x_j)(\exists y_j < x_j)$
$[(j = 0 \to (\exists z)\, \psi(w_0, x_0, 0, z))$
$\wedge\ (j > 0 \to (\exists z)\, \psi(w_j, x_j, y_{j-1}, z)) \wedge (\exists z)\, \psi(w_i, x, y_{i-1}, z)]$.


Since $B$ satisfies $\mathsf{B}\Sigma^0_n$, it follows by Proposition 6.1.2 that this is equivalent in
$B$ to a $\Sigma^0_{n-1}$ formula. Clearly, $(\forall x)(\forall u)(\forall v)[u < v \wedge \theta(u, x) \to \theta(v, x)]$ is true in
both $M$ and $B$. Also, $b \in M$ belongs to $B$ if and only if there is an $m \in \omega$ such
that $M \models \theta(m, b)$. In particular, if $b \in B$ then $M \models \theta(a, b)$ for every nonstandard
$a \in M$. Conversely, if $M \models \theta(a, b)$ for some $b \in M$ and every nonstandard $a \in M$,
then by underspill (Proposition 6.2.2 (2)) we must have that $M \models \theta(m, b)$ for some
$m \in \omega$, and hence $b \in B$. Since $B$ is a $\Sigma^0_{n-1}$ elementary substructure of $M$, it follows
that $a \in B$ is nonstandard if and only if $B \models (\forall x)\, \theta(a, x)$, so $a \in \omega$ if and only if
$B \models (\exists x)\, \neg\theta(a, x)$. □


Corollary 6.3.6. Fix $n \geq 2$.


1. There exists a model of $\mathsf{RCA}_0 + \mathsf{I}\Sigma^0_{n-1} + \neg\mathsf{B}\Sigma^0_n$.
2. There exists a model of $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_n + \neg\mathsf{I}\Sigma^0_n$.


Proof. For either part, let $M$ be the corresponding model from Theorem 6.3.4 or
Theorem 6.3.5. Then, let $\mathcal{M}$ be the $\mathcal{L}_2$ structure with first order part $M$ and second
order part all $\Delta^0_1$ definable subsets of $M$. By Theorem 5.10.1, $\mathcal{M} \models \mathsf{RCA}_0$ and has
the desired properties. □


6.4 Reverse recursion theory


One place induction and bounding arise in reverse mathematics is in its analysis 
of computability theory, commonly referred to as reverse recursion theory. This is 
usually understood to concern results of classical computability theory, chiefly about 
the computably enumerable (c.e.) sets and degrees. In a model $M$ of $\mathsf{PA}^-$, we say
a set $S \subseteq M$ is c.e. if it is $\Sigma^0_1$ definable in $M$ (possibly with number parameters).
In a model $\mathcal{M}$ of $\mathsf{RCA}_0$, we can similarly define $S$ being $X$-c.e., for $X \in \mathcal{S}^{\mathcal{M}}$. In
fact, as constructions in this part of computability theory are, almost by definition,
computable, we may expect the entire discussion to be accommodated in $\mathsf{RCA}_0$. And
this is almost the case, except for issues of induction. The main concern is that while 
a construction may be computable, the verification that it does what it purports to 
do can be more complex. 

We will briefly survey a few results along these lines. We will omit all proofs, 
which tend to be quite involved in this area. 

A central focus of reverse recursion theory is on the priority method, the hallmark 
technique of modern degree theory. A good introduction can be found in Downey 
and Hirschfeldt [83], Lerman [194], or Soare [293]. A basic priority construction is 
organized into stages, and the properties of the set to be constructed are expressed 
through countably many requirements and assigned relative priorities. There is a 
plethora of different kinds of priority constructions, of course, but even in the 
simplest we must take care when we move to a formal system. Consider a finite injury 
argument. Say our requirements are $R_0, R_1, \ldots$, with $R_i$ having higher priority than
$R_j$ if $i < j$. Typically we will have an argument that each $R_i$, if it is unsatisfied at a
given stage, can “act” (cause us to take some action on its behalf) at a later stage and
become satisfied. We can then prove that all requirements are eventually satisfied by
induction. Namely, fix $i$ and assume there is a stage $s_0$ so that no $R_j$ with $j < i$ acts
at any stage $s \geq s_0$. Now we can appeal to the earlier argument to find a stage $s_1 \geq s_0$
at which $R_i$ acts and becomes satisfied (if it was satisfied already, we take $s_1 = s_0$).
Since no $R_j$ with $j < i$ will act again after stage $s_1$, this means no requirement can
injure $R_i$, and hence it stays satisfied at all stages $s \geq s_1$ (and does not act again).
This much will be familiar from a first course in computability. What may be less
so is the explicit identification of the induction statement here as $\Sigma^0_2$ (assuming that
“acting” and “being satisfied” are $\Delta^0_1$ properties, as is usually the case). Thus, on its
face, we need $\Sigma^0_2$ induction to formalize this argument.

There are two basic kinds of finite injury arguments, distinguished by whether 
or not the number of injuries to each requirement is computably bounded. The 
most famous example of the “bounded” type is the celebrated Friedberg—Muchnik 
construction of two Turing incomparable c.e. sets, which was historically the very first 
priority argument. (There, each requirement, once all higher priority requirements 
have been permanently satisfied, acts at most once. So the number of times $R_i$ will act
is at most $2^i$.) The “unbounded” type is typified by the Sacks splitting theorem, that
every noncomputable c.e. set can be partitioned into two Turing incomparable c.e.
sets. It turns out that both the Friedberg–Muchnik theorem and the Sacks splitting
theorem can be carried out in $\mathsf{I}\Sigma^0_1$. In fact, in the former case even weaker axioms suffice,
while in the latter $\mathsf{I}\Sigma^0_1$ is optimal.
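
The $2^i$ bound quoted above can be recovered from a simple count. The sketch below assumes one particular accounting, which is not spelled out in the text: a requirement acts once initially and at most once more after each action of a higher priority requirement. The function name `max_actions` is illustrative.

```python
def max_actions(i):
    """Upper bound on how often R_i acts under the assumed accounting.

    bounds[j] is the bound for R_j; R_i may act once initially and once
    more after each action of any higher priority requirement R_j, j < i.
    """
    bounds = []
    for _ in range(i + 1):
        bounds.append(1 + sum(bounds))
    return bounds[i]


# The recurrence a(i) = 1 + a(0) + ... + a(i-1) solves to a(i) = 2**i.
print([max_actions(i) for i in range(6)])
```

Since this bound is given by a computable function of $i$, the construction is of the “bounded” type in the sense described above.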


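Theorem 6.4.1 (Chong and Mourad [37]). $\mathsf{PA}^- + \mathsf{I}\Sigma^0_0 + \mathsf{B}\Sigma^0_1 + \mathsf{Exp}$ proves the
Friedberg–Muchnik theorem.


Theorem 6.4.2 (Mytilinaios [230]; Mourad [224]). Over $\mathsf{PA}^- + \mathsf{I}\Sigma^0_0 + \mathsf{B}\Sigma^0_1 + \mathsf{Exp}$,
the following are equivalent.


1. $\mathsf{I}\Sigma^0_1$.
2. The Sacks splitting theorem.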


Pushing finite injury constructions of the “bounded” type down into $\mathsf{I}\Sigma^0_1$ is not
difficult in general. Suppose we are running such a construction in a model $\mathcal{M}$ of
$\mathsf{RCA}_0$. Thus we have requirements and priorities as above (indexed now by elements
of $M$), and we also have a $\Sigma^0_1$ definable function $f : M \to M$ such that $\mathcal{M}$ satisfies
that $f(i)$ bounds the number of times $R_i$ acts during the course of our construction.
In this case, for each $i \in M$, the set $J = \{\langle j, n \rangle : j < i \wedge R_j \text{ acts at least } n \text{ times}\}$ is
$\Sigma^0_1$ definable and bounded in $\mathcal{M}$ by $b = \langle i, \sum_{j<i} f(j) \rangle$.


By bounded $\Sigma^0_1$ comprehension (Proposition 6.2.5), $J$ is $\mathcal{M}$-finite, so the formula

$(\forall \langle j, n \rangle \in J)(\exists s)[R_j \text{ acts } n \text{ times by the end of stage } s]$


is an instance of $\mathsf{B}\Sigma^0_1$ in $\mathcal{M}$. Since $\mathcal{M}$ satisfies $\mathsf{B}\Sigma^0_1$ we may fix $s_0$ bounding the
existential quantifier. It follows that no $R_j$ with $j < i$ acts at any stage $s \geq s_0$, and
we can now proceed as above to conclude that $R_i$ acts (and is permanently satisfied)
at some stage $s_1 \geq s_0$.
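
The boundedness of $J$ can be illustrated in the standard model. The sketch below makes two assumptions that are not fixed by the text: pairs are coded with the Cantor pairing function, and the bound is taken to be the code of the pair consisting of $i$ and the sum of $f(j)$ over $j < i$. The names `pair`, `injury_set`, and `injury_bound`, and the sample action counts, are hypothetical.

```python
def pair(j, n):
    """Cantor pairing; nondecreasing in n and strictly increasing in j."""
    return (j + n) * (j + n + 1) // 2 + n


def injury_set(i, acts):
    """Codes of the pairs (j, n) with j < i and R_j acting at least n times.

    acts[j] is a hypothetical count of how often R_j acts.
    """
    return {pair(j, n) for j in range(i) for n in range(acts[j] + 1)}


def injury_bound(i, f):
    """Code of the pair (i, f(0) + ... + f(i-1))."""
    return pair(i, sum(f(j) for j in range(i)))


# Sample data: f(j) = 2**j bounds the action counts 1, 2, 3, 5.
acts = [1, 2, 3, 5]
J = injury_set(4, acts)
print(max(J) < injury_bound(4, lambda j: 2 ** j))
```

Every element of `J` lies below the bound because each coded pair has first coordinate below $i$ and second coordinate at most the sum of the $f(j)$.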

The above argument does not work for finite injury constructions of the “unbounded”
type, and yet many still go through in $\mathsf{I}\Sigma^0_1$. An example relevant to us is
the following:


Theorem 6.4.3 (Hájek and Kučera [133], Mourad [224]). Over $\mathsf{PA}^- + \mathsf{I}\Sigma^0_0 + \mathsf{B}\Sigma^0_1 +
\mathsf{Exp}$, the following are equivalent.


1. $\mathsf{I}\Sigma^0_1$.
2. The low basis theorem.


Note that our proof of the low basis theorem (Theorem 2.8.18) was actually a forcing 
construction (which we will discuss in depth in the next chapter) but can also be 
presented as a finite injury priority construction with an “unbounded” number of 
injuries to each requirement (see, e.g., [66], Section 3.4.3). 

In general, there are various methods for formalizing these kinds of finite injury
constructions, most notably Shore blocking, originally developed in $\alpha$-recursion
theory by Shore [279], and later adapted for use in arithmetic by Mytilinaios [230].
We will mention an application of this technique in Chapter 9. But not all such
constructions are known to be provable in $\mathsf{I}\Sigma^0_1$, and indeed, there are examples that
require $\mathsf{B}\Sigma^0_2$. A detailed analysis was undertaken by Groszek and Slaman [130].

For completeness, we mention briefly also infinite injury arguments. Again, the
most well-known examples here are c.e. set constructions. Recall the following
computability theoretic notions.


• A c.e. set $A$ is maximal if it is coinfinite and for every c.e. set $B \supseteq A$, either
$B =^* \omega$ or $B \setminus A =^* \emptyset$.

• Two sets $A_0$ and $A_1$ form a minimal pair if $A_0, A_1 \nleq_T \emptyset$ and every $B \leq_T A_0, A_1$
satisfies $B \leq_T \emptyset$.


The existence of maximal sets and c.e. minimal pairs are both classical applications
of the infinite injury method. And, fittingly, the induction needed to prove these
results reflects the additional complexity of this method over finite injury.


Theorem 6.4.4 (Chong and Yang [41]). Over $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2$, the following are
equivalent.

1. $\mathsf{I}\Sigma^0_2$.

2. There exists a maximal set.


Theorem 6.4.5 (Chong, Qiang, Slaman, and Yang [38]). Over $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2$, the
following are equivalent.


1. $\mathsf{I}\Sigma^0_2$.
2. There exists a minimal pair of c.e. sets.


Reverse recursion theory is not limited just to priority constructions. Many other 
results from computability theory can be, and have been, analyzed in this framework. 
For example, the limit lemma (Theorem 2.6.5) can be carried out in just $\mathsf{B}\Sigma^0_1$. (See
Švejdar [319].) For a much more comprehensive overview, including sketches of
some core arguments, we refer to the survey of Chong, Li, and Yang [36].


6.5 Hirst’s theorem and $\mathsf{B}\Sigma^0_2$


By far the most commonly encountered bounding scheme in reverse mathematics is
$\mathsf{B}\Sigma^0_2$. This may be partially attributed to the fact that many constructions in $\mathsf{RCA}_0$
involve $\Sigma^0_2$ and $\Pi^0_2$ formulas (a set being finite or infinite, a function being total, etc.).
So, logical manipulations along the lines of Proposition 6.1.2, which are widespread,
quickly cause $\mathsf{B}\Sigma^0_2$ to appear. We have also seen in Proposition 6.2.4 that some very
elementary number theoretic facts (such as every $\Sigma^0_2$ definable function with domain
an initial segment of $\mathbb{N}$ having bounded range) can be equivalent to $\mathsf{B}\Sigma^0_2$. Indeed,
this may lead us to think there cannot be many cases where we can do without $\mathsf{B}\Sigma^0_2$!
More surprising, however, is that $\mathsf{B}\Sigma^0_2$, a first order principle solely about
properties of numbers, can be equivalent to a second order principle about sets of
numbers. This realization was first made in the following seminal result of Hirst.


Theorem 6.5.1 (Hirst [154]). $\mathsf{RCA}_0 \vdash \mathsf{RT}^1 \leftrightarrow \mathsf{B}\Sigma^0_2$.


Proof. ($\to$). We reason in $\mathsf{RCA}_0$ and work with $\mathsf{B}\Pi^0_1$ instead of $\mathsf{B}\Sigma^0_2$ for convenience.
Assume $\mathsf{RT}^1$. Let $\varphi(x, y)$ be a $\Pi^0_1$ formula such that for some $z$ we have $(\forall x <
z)(\exists y)\, \varphi(x, y)$. Write $\varphi(x, y)$ as $(\forall u)\, \psi(u, x, y)$, where $\psi$ is $\Sigma^0_0$. Now define a function
$f : \mathbb{N} \to \mathbb{N}$ as follows: for every $a \in \mathbb{N}$,

$$f(a) = \begin{cases} (\mu b < a)(\forall x < z)(\exists y < b)(\forall u < a)\, \psi(u, x, y) & \text{if it exists,} \\ a & \text{otherwise.} \end{cases}$$

Note that $f$ is $\Sigma^0_0$ definable, and so exists.
We claim that $f$ has bounded range. Suppose not. Then by $\Delta^0_1$ comprehension
there is a sequence $\langle a_i : i \in \mathbb{N} \rangle$ such that $a_0 < a_1 < \cdots$ and $f(a_0) < f(a_1) < \cdots$.
Define $c : \mathbb{N} \to z$ as follows: for every $i$, let


$c(i) = (\mu x < z)(\forall y < f(a_i))(\exists u < a_{i+1})\, \neg\psi(u, x, y)$.


Then $c$ exists and is total. Indeed, if there were an $i$ such that $(\forall x < z)(\exists y <
f(a_i))(\forall u < a_{i+1})\, \psi(u, x, y)$ we would have $f(a_{i+1}) \leq f(a_i)$ instead of $f(a_i) <
f(a_{i+1})$. So $c$ is an instance of $\mathsf{RT}^1$, and we may consequently fix an infinite
homogeneous set $H$ for it, say with color $x < z$. By hypothesis, there is a $y$ such that $\varphi(x, y)$.
But if we fix $i \in H$ such that $f(a_i) > y$, then $(\exists u < a_{i+1})\, \neg\psi(u, x, y)$ by definition of $c$,
and so in particular $\neg\varphi(x, y)$, a contradiction.

Having proved the claim, let $w$ bound the range of $f$. Thus, for every $a$ we have


$(\forall x < z)(\exists y < w)(\forall u < a)\, \psi(u, x, y)$.


Fix $x < z$. Define $d : \mathbb{N} \to w$ as follows: for each $a$, $d(a) = (\mu y < w)(\forall u <
a)\, \psi(u, x, y)$. Then $d$ is an instance of $\mathsf{RT}^1$, and so we may fix an infinite homogeneous
set $H$ for it, say with color $y < w$. Thus, there are infinitely many $a$ such that
$(\forall u < a)\, \psi(u, x, y)$, and so it follows that $(\forall u)\, \psi(u, x, y)$ and therefore that $\varphi(x, y)$.
Since $x$ was arbitrary, we conclude that $(\forall x < z)(\exists y < w)\, \varphi(x, y)$, as desired.


($\leftarrow$). Next, assume $\mathsf{B}\Pi^0_1$ and let $f : \mathbb{N} \to z$ be a given instance of $\mathsf{RT}^1$. Let $\varphi(x, y)$
be the $\Pi^0_1$ formula $(\forall u)[u > y \to f(u) \neq x]$. Then if $f$ has no infinite homogeneous
set, we have


$(\forall x < z)(\exists y)\, \varphi(x, y)$.

Fix $w$ as given by $\mathsf{B}\Pi^0_1$ applied to $\varphi$. Clearly, we have $\varphi(x, y) \to \varphi(x, y^*)$ whenever
$y < y^*$, and so in particular

$(\forall x < z)(\forall u)[u > w \to f(u) \neq x]$.

But this cannot be since $f$ is a map into $z$. Thus, $f$ must have an infinite homogeneous
set and the proof is complete. □


The way $\mathsf{B}\Sigma^0_2$ is used in this proof is interesting. $\mathsf{RT}^1$ admits computable solutions,
so there is no sense in which $\mathsf{B}\Sigma^0_2$ is helping with the actual construction of an
infinite homogeneous set. Rather, it is used to expose the contradiction in a reductio
assumption that no infinite homogeneous set exists.

It is worth noting that Hirst’s theorem is unrelated to the issue of nonuniformity
in picking an $i$ such that $f(x) = i$ for infinitely many $x$. The same issue occurs with
$\mathsf{RT}^1_k$ for any standard $k \geq 2$, but $\mathsf{RT}^1_k$ is provable in $\mathsf{RCA}_0$. (The same proof works,
but the bound $y^*$ can be found directly.) More tellingly, the following principle is
completely uniform (it uniformly admits computable solutions), yet it exhibits the
same behavior as $\mathsf{RT}^1$.


Definition 6.5.2 (Extraction of a homogeneous set from a limit homogeneous
set). $\mathsf{LH}$ is the following statement: for every $c : [\mathbb{N}]^2 \to 2$ such that $\lim_y c(x, y) = 1$
for all $x$, there exists an infinite set $H$ such that $c(x, y) = 1$ for all $\langle x, y \rangle \in [H]^2$.


In Definition 8.4.1, we will define a set $L$ to be limit homogeneous for a coloring
$c : [\omega]^2 \to 2$ if it has the above property with respect to all $x \in L$. Every such set has
a uniformly $c \oplus L$-computable infinite homogeneous subset (see Proposition 8.4.2).
But formalizing this cannot in general be done in $\mathsf{I}\Sigma^0_1$. The proof of the next result is
left to Exercise 6.7.6.


Proposition 6.5.3 (Dzhafarov, Hirschfeldt, and Reitzes [82]). Over $\mathsf{RCA}_0$, $\mathsf{LH}$ is
equivalent to $\mathsf{RT}^1$.


Hirst’s theorem makes $\mathsf{B}\Sigma^0_2$ in many cases easier to work with. An example is the
following proposition, which reveals yet more guises of $\mathsf{B}\Sigma^0_2$, and which should be
compared to Propositions 4.2.10 and 4.3.4.


Proposition 6.5.4. Over $\mathsf{RCA}_0$, the following are equivalent.


1. $\mathsf{B}\Sigma^0_2$.
2. $\mathsf{RT}^1$.
3. $\mathsf{IPHP}$.
4. General-$\mathsf{IPHP}$.
5. $\mathsf{FUF}$.


Proof. We argue in $\mathsf{RCA}_0$. That (1) $\to$ (2) is one half of Hirst’s theorem
(Theorem 6.5.1). That (2) $\to$ (3) is clear.

To prove (3) $\to$ (4), proceed as in the proof of Proposition 4.3.4. Suppose we
are given an instance of General-$\mathsf{IPHP}$. This is a sequence $\langle X_i : i \in \mathbb{N} \rangle$ of subsets of
$\mathbb{N}$ such that $X_i = \emptyset$ for almost all $i$ and for every $x$ there is a $y > x$ such that $y \in X_i$
for some $i$. Form the set $S = \{\langle x, \langle i, y \rangle\rangle : y > x \wedge y \in X_i\}$. Now using $\mathsf{L}\Sigma^0_0$, define
$g : \mathbb{N} \to \mathbb{N}$ as follows: $g(x) = \langle i, y \rangle$ for the unique $\langle i, y \rangle$ such that $\langle x, \langle i, y \rangle\rangle \in S$
and there is no $\langle x, z \rangle < \langle x, \langle i, y \rangle\rangle$ in $S$. Define $f : \mathbb{N} \to \mathbb{N}$ by $f(x) = i$ where
$g(x) = \langle i, y \rangle$. By assumption on $\langle X_i : i \in \mathbb{N} \rangle$, the range of $f$ is bounded. Hence, $f$
is an instance of $\mathsf{IPHP}$. By $\mathsf{IPHP}$, there is an $i$ such that $f(x) = i$ for infinitely many
$x$. Then $X_i$ is infinite.

To prove (4) $\to$ (5), fix an instance of $\mathsf{FUF}$. This is a sequence $\langle F_i : i \in \mathbb{N} \rangle$ of
finite sets such that $F_i = \emptyset$ for almost all $i$. If, for every $x$ there were a $y > x$ such
that $y \in F_i$ for some $i$, then $\langle F_i : i \in \mathbb{N} \rangle$ would be an instance of General-$\mathsf{IPHP}$. Since
we are assuming General-$\mathsf{IPHP}$, we know that this would imply that for some $i$, $F_i$
is infinite. We conclude that there is an $x$ such that for all $y$, if $y \in F_i$ for some $i$ then
$y < x$.

Finally, we prove (5) $\to$ (2) (which implies (1) by the other half of Hirst’s
theorem). Let $f : \mathbb{N} \to z$ be given and assume that no infinite homogeneous set
exists. For each $i$, define $F_i = \{x : f(x) = i\}$ if $i < z$ and $F_i = \emptyset$ if $i \geq z$. Then
$\langle F_i : i \in \mathbb{N} \rangle$ exists, and is an instance of $\mathsf{FUF}$. By $\mathsf{FUF}$, there is a $b$ such that for all
$x$, if $x \in F_i$ for some $i$ then $x < b$. But this is impossible, since $f(b) = i$ for some
$i < z$, and hence $b \in F_i$. □


Another example of the occurrence of $\mathsf{B}\Sigma^0_2$ happens when a property of individual
numbers is phrased instead as a property of an initial segment of the natural numbers.
To illustrate this, recall the problem $\mathsf{COH}$ from Definition 4.1.6. As a mathematical
principle, this states that for every sequence $\langle R_i : i \in \omega \rangle$ of elements of $2^\omega$, there
exists an infinite set $S$ such that


$(\forall i)(\exists w_i)[(\forall x > w_i)[x \in S \to x \in R_i] \vee (\forall x > w_i)[x \in S \to x \notin R_i]]$.

Consider the following variant introduced by Hirschfeldt and Shore [152].


Definition 6.5.5 (Strongly cohesive principle). $\mathsf{StCOH}$ is the following statement:
for every sequence $\vec{R} = \langle R_i : i \in \omega \rangle$ of elements of $2^\omega$, there exists an infinite set $S$
such that


$(\forall z)(\exists w)(\forall i < z)[(\forall x > w)[x \in S \to x \in R_i] \vee (\forall x > w)[x \in S \to x \notin R_i]]$.


As problems, $\mathsf{StCOH}$ and $\mathsf{COH}$ are equivalent over $\omega$-models. As $\forall\exists$ theorems, where
$z$ may be nonstandard, the equivalence is no longer obvious because $\{w_i : i < z\}$
could be unbounded. Indeed, there is a provable difference between the principles.


Proposition 6.5.6 (Hirschfeldt and Shore [152]). $\mathsf{RCA}_0 \vdash \mathsf{StCOH} \to \mathsf{B}\Sigma^0_2$.


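Proof. We argue in $\mathsf{RCA}_0 + \mathsf{StCOH}$ and derive $\mathsf{RT}^1$. Fix $z \in \mathbb{N}$ and an instance
$c : \mathbb{N} \to z$ of $\mathsf{RT}^1$. Define an instance $\langle R_i : i \in \mathbb{N} \rangle$ of $\mathsf{StCOH}$ as follows: for all $x$,

$$R_i(x) = \begin{cases} 1 & \text{if } c(x) = i, \\ 0 & \text{otherwise.} \end{cases}$$

By $\mathsf{StCOH}$, fix an infinite set $S$ and a $w \in \mathbb{N}$ such that


$(\forall i < z)[(\forall x > w)[x \in S \to x \in R_i] \vee (\forall x > w)[x \in S \to x \notin R_i]]$.


Now say $c(w+1) = i$. Since $i < z$, it follows that $(\forall x > w)[x \in S \to x \in R_i]$. Hence
$\{x \in R_i : x > w\}$ is an infinite homogeneous set for $c$. □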


By contrast, in Theorem 7.7.1 we will see that $\mathsf{COH}$ does not imply $\mathsf{B}\Sigma^0_2$ over $\mathsf{RCA}_0$.
In fact, we will prove that $\mathsf{COH}$ is $\Pi^1_1$ conservative over $\mathsf{RCA}_0$. Thus, what may seem like
a very casual maneuver in a proof, namely replacing $(\forall i)(\exists w)$ by $(\forall z)(\exists w)(\forall i < z)$, takes
additional inductive power.

We mention one further example which uses a principle introduced by Hirschfeldt 
and Shore [152]. If $X$ is an infinite set and $<_X$ a linear ordering of $X$, we say $(X, <_X)$
is of order type $\omega + \omega^*$ if the following hold.


1. $X$ has a $<_X$-least element $\ell$ and a $<_X$-greatest element $g$.
2. For all $x \in X$, exactly one of $\{y \in X : y <_X x\}$ and $\{y \in X : x <_X y\}$ is infinite.


We say $(X, <_X)$ is strongly of order type $\omega + \omega^*$ if for every finite set

$\{\ell = x_0 <_X \cdots <_X x_k = g\}$

there exists exactly one $i < k$ such that $\{y \in X : x_i <_X y <_X x_{i+1}\}$ is infinite.


Definition 6.5.7 (Partition principle). $\mathsf{PART}$ is the following statement: if $<_X$ is
a linear ordering of $\mathbb{N}$ such that $(\mathbb{N}, <_X)$ is of order type $\omega + \omega^*$, then $(\mathbb{N}, <_X)$ is
strongly of order type $\omega + \omega^*$.


It can be seen that $\mathsf{PART}$ follows from $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2$. (See Exercise 6.7.4.) What is
much harder, and indeed, was an open question for some time, is to see that the
converse holds as well.


Theorem 6.5.8 (Chong, Lempp, and Yang [34]). $\mathsf{RCA}_0 \vdash \mathsf{PART} \leftrightarrow \mathsf{B}\Sigma^0_2$.


The proof, which would take us too far afield in this book, proceeds via an
intermediary step. First, a certain special kind of cut is defined, called a bi-tame cut. It is
then shown that a model $\mathcal{M}$ satisfies $\mathsf{B}\Sigma^0_2$ if and only if $\mathcal{M}$ has no bi-tame cut. This is
a characterization very analogous to that of induction in Theorem 6.2.3. Separately,
it is then shown that a model $\mathcal{M}$ satisfies $\mathsf{PART}$ if and only if $\mathcal{M}$ has no bi-tame cut.


6.6 So, why $\Sigma^0_1$ induction?


Having explored more of the universe of fragments of induction, we may well wonder
why $\mathsf{I}\Sigma^0_1$ enjoys a privileged position in our investigation of subsystems of second
order arithmetic. Even before our discussion, this question arises quite naturally.
After all, on the view that $\mathsf{RCA}_0$ is a formalization of computable mathematics, the
restriction to $\Delta^0_1$ comprehension is clear, but the restriction to $\Sigma^0_1$ induction less so.
There are two reasonable positions here. On the one hand, we may object to
restricting induction at all. We may well commit ourselves to computable methods
in our constructions but retain classical reasoning. Then if a property holds of 0, and
of $n + 1$ whenever it holds of $n$, we can accept that it holds of all $n$ regardless of its
complexity. At the very least, it may seem reasonable to accept something stronger,
like $\mathsf{B}\Sigma^0_2$. On the other hand, we may demand restricting induction even further. If,
say, we take a constructivist approach, then concluding that a property holds for all
$n$ requires us to actually demonstrate this fact, for each individual $n$, constructively.
In this case, the complexity of the property in question matters very much and we
may only find it reasonable to consider $\Delta^0_1$ (or perhaps even lower) properties.

Intuitively, the restriction to $\Sigma^0_1$ induction can be seen as a compromise between
these two perspectives: that $\mathsf{RCA}_0$ should be a formal system corresponding to
computability theory, and that $\mathsf{RCA}_0$ should be a formal system for the foundation
of (a large amount of) mathematics.

As to simplicity, we have already seen in Corollary 5.10.2 that $\mathsf{RCA}_0$ is $\Pi^0_2$
conservative over $\mathsf{PRA}$, which is known not to be the case with more induction added.
It is certainly satisfying that the first order part of our base theory should line
up with such an old and well-known metamathematical system. Another measure
of simplicity is via ordinal analysis. The proof theoretic ordinal of a theory $T$ of
arithmetic is the least ordinal number $\alpha$ that $T$ cannot prove to be well ordered. We
recommend Pohlers’ survey article [249] for more background. The proof theoretic
ordinal of $\mathsf{RCA}_0$ is $\omega^\omega$, which is not overly large, and happens to be the same as that
of $\mathsf{PRA}$, $\mathsf{PA}^- + \mathsf{I}\Sigma^0_1$, and $\mathsf{WKL}_0$. The fact that theorems can be proven in these systems
with restricted induction means that these theorems do not have exceptionally high
consistency strength, which is of independent foundational interest.

In terms of formalizing as much of mathematics as possible, we will see throughout 
this book that $\Sigma^0_1$ induction suffices more often than not. Admittedly, this chapter, 
particularly the preceding section, presents $\mathsf{B}\Sigma^0_2$ as indispensable to many arguments. 
But these still make up only a minority within reverse mathematics. Often, even when 
a classical proof appears on its face to use more than $\Sigma^0_1$ induction, there are ways to 
temper the proof to make it go through. (We saw examples in Section 6.4, and will 
see more later, e.g., in the proof of Theorem 8.5.1.) Of course, once we get at least 
to the level of $\mathsf{ACA}_0$, we have full arithmetical induction anyway, so the question is 
no longer relevant. 

To be sure, there is a rich theory dedicated to the question of what is and is 
not provable in systems where $\Sigma^0_1$ induction is weakened further, for example to 
$\mathsf{I}\Sigma^0_0 + \mathsf{B}\Sigma^0_1 + \mathsf{Exp}$, which we saw above. (The latter, when swapped for $\mathsf{I}\Sigma^0_1$ in $\mathsf{RCA}_0$, 
is sometimes denoted $\mathsf{RCA}_0^*$.) For the purposes of studying “ordinary” mathematical 
theorems, these systems are quite limited. In Section 6.4, we saw several standard 
computability theoretic theorems that are equivalent to $\mathsf{I}\Sigma^0_1$ over $\mathsf{PA}^- + \mathsf{I}\Sigma^0_0 + \mathsf{B}\Sigma^0_1 + \mathsf{Exp}$. 
An even more basic such example is the result that every polynomial over a countable 
field has only finitely many roots. But for us, perhaps the most convincing along these 
lines is the following result on definitions by recursion. 


Theorem 6.6.1 (Hirschfeldt and Shore [152]). The following are equivalent over 
$\mathsf{PA}^- + \mathsf{I}\Sigma^0_0 + \mathsf{B}\Sigma^0_1$. 
1. $\mathsf{I}\Sigma^0_1$. 
2. For every z, every m, and every $\Sigma^0_0$ definable function $f \colon \mathbb{N} \to \mathbb{N}$, there is a 
function $g \colon m+1 \to \mathbb{N}$ such that g(0) = z and for all x < m, g(x+1) = f(g(x)). 
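
To make the second clause concrete, the following Python sketch builds the finite sequence g by iterating f. The names `finite_recursion`, `f`, `z`, and `m` are illustrative only. Running a loop in Python of course says nothing about which induction axioms are needed to prove that g exists; that is the content of the theorem.

```python
from typing import Callable, List


def finite_recursion(f: Callable[[int], int], z: int, m: int) -> List[int]:
    """Return g as a list of length m + 1 with g[0] = z and g[x+1] = f(g[x])."""
    g = [z]
    for x in range(m):
        g.append(f(g[x]))
    return g


# Illustrative choice of f (an assumption of this sketch): n -> 2n + 1.
g = finite_recursion(lambda n: 2 * n + 1, z=1, m=5)
```

Here g has domain m + 1 = {0, ..., m}, matching the statement of the theorem.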


Thus $\mathsf{I}\Sigma^0_1$ is needed to carry out even finite recursion (using easy-to-define functions). 
Hence, this level of induction is quite important to maintaining a faithful connection 
between $\mathsf{RCA}_0$ and computable mathematics. In this sense, then, we see that $\Sigma^0_1$ 
induction is a kind of sweet spot, neither too weak nor too strong. 

One final aspect of the choice of $\Sigma^0_1$ induction is historical. As discussed above, on 
the whole there are relatively few “ordinary” theorems that are not provable in $\mathsf{RCA}_0$ 
but are provable with additional induction. This is especially true for principles from 
analysis and algebra, which were the early focus of the subject. Indeed, the first major 
counterexample came from a different field, combinatorics, with Hirst’s theorem 
(Theorem 6.5.1). This theorem dates to 1987, by which point in time, of course, the 
theory $\mathsf{RCA}_0$ was already well established. 


6.7 Exercises 


Exercise 6.7.1. Show that the following is provable in $\mathsf{RCA}_0$. If $T \subseteq 2^{<\mathbb{N}}$ is an infinite 
binary tree, then for each $\ell \in \mathbb{N}$ there exists an extendible $\sigma \in T$ with $|\sigma| = \ell$. 


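Exercise 6.7.2. Let $\Gamma$ be a collection of formulas of $\mathcal{L}_1$ or $\mathcal{L}_2$. The $\Gamma$ strong induction 
scheme ($\mathsf{SI}\Gamma$) is the scheme consisting of all sentences of the form 

$$(\forall x)[(\forall y < x)\varphi(y) \to \varphi(x)] \to (\forall x)\varphi(x),$$

for $\varphi(x) \in \Gamma$. Prove that for each $n \geq 1$, $\mathsf{RCA}_0 \vdash \mathsf{I}\Sigma^0_n \leftrightarrow \mathsf{SI}\Sigma^0_n \leftrightarrow \mathsf{SI}\Pi^0_n$. 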


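Exercise 6.7.3. Show that the following is provable in $\mathsf{RCA}_0$. For every function 
$f \colon \mathbb{N} \to \mathbb{N}$ and every w, there exists a z such that 

$$(\forall y < w)[y \in \operatorname{range}(f) \leftrightarrow y \in \operatorname{range}(f \upharpoonright z)].$$


Exercise 6.7.4. Show that $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2 \vdash \mathsf{PART}$. 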


Exercise 6.7.5 (Corduan, Groszek, and Mileti [56]). Fix n > 1 and suppose 
$M \models \mathsf{RCA}_0 + \mathsf{B}\Sigma^0_n + \neg\mathsf{I}\Sigma^0_n$. Let $I \subseteq M$ be a $\Sigma^0_n$-definable proper cut of M. Say $\psi$ 
is a $\Sigma^0_{n-2}$ formula such that $a \in I \leftrightarrow M \models (\exists w)(\forall z)\psi(a, w, z)$ for all $a \in M$. Fix 
$b \in M \setminus I$ and define $g \colon (M \upharpoonright b) \times M \to M$ by letting g(a, s) be either the $<^M$-least 
$w <^M s$ such that $M \models (\forall z < s)\psi(a, w, z)$, or s if no such w exists. 


1. Show that g is $\Delta^0_{n-1}$-definable in M. 

2. Show that $M \models$ g is a total function on its domain. 

3. Show that if $a \in I$ then $\lim_s g(a, s)$ exists. 

4. Show that $a \mapsto \lim_s g(a, s)$ is an unbounded function on I. 


Exercise 6.7.6 (Dzhafarov, Hirschfeldt, and Reitzes [82]). Prove Proposi- 
tion 6.5.3. 


Exercise 6.7.7. Prove there is a countable model of $\mathsf{PA}$ with an element that is not $\Sigma^0_n$ 
definable for any $n \in \omega$. Hence $\bigcup_n K_n(M)$ is a proper subset of M. (Much stronger 
results are possible; see Murawski [228].) 


Chapter 7 


Forcing 


Forcing is a profoundly powerful piece of machinery in mathematical logic. It was 
invented by Cohen in the 1960s as the miraculous ingredient that allowed him to 
prove that the axiom of choice and the continuum hypothesis are both independent 
of ZF ([47, 48, 49]). Subsequently, forcing became a household tool in set theory, 
but not only there. 

Shortly after Cohen’s breakthroughs, Feferman [102] developed a version for use 
in arithmetic, which he used to construct a number of examples of exotic subsets 
of w, such as a finite tuple of sets each nonarithmetical in the join of the others 
(see Exercise 7.8.1). This paved the way for the widespread use of forcing in proof 
theory, model theory, and computability theory. In fact, the rudiments of forcing in 
computability theory have an even longer history, predating the work of Cohen by 
a decade. The basic ideas are already present in the finite-extension method, which 
was developed in the 1950s by Kleene and Post. 

Quintessentially, forcing is a framework for constructing objects with prescribed 
combinatorial properties. The systematic study of the effective properties of these 
objects was initiated by Jockusch [169] and continues to this day. Unsurprisingly, 
the technique has immense applications to reverse mathematics, as we explore in 
this chapter. 


7.1 A motivating example 


The basic premise of forcing is to construct an object G by a sequence of approximations 
of some kind, called conditions, each of which ensures, or forces, a particular 
property about the G obtained in the end. In computability theory and arithmetic, the 
objects we look to construct are usually subsets of w, and the properties are usually 
arithmetical. 

A canonical example of this is the construction of a set by a sequence of longer 
and longer binary strings, the so-called finite extension method in computability 
theory. Here, our conditions are finite strings $\sigma \in 2^{<\omega}$, regarded as initial segments 
of the characteristic function of the eventual set G we want. We build a sequence 
$\sigma_0 \preceq \sigma_1 \preceq \cdots$ of conditions and then set $G = \bigcup_i \sigma_i$. During the construction we 
usually choose $\sigma_i$ carefully to satisfy various properties we want G to have. For 
example, if we want G to be noncomputable, then we might proceed as follows. We 
set $\sigma_0 = \langle\rangle$, and having defined $\sigma_e$ for $e \in \omega$, we let $\sigma_{e+1} = \sigma_e{}^\frown(1 - \Phi_e(e))$ if 
$\Phi_e(e)\downarrow \in \{0, 1\}$, and otherwise we let $\sigma_{e+1} = \sigma_e{}^\frown 0$. In this way, even though we have 
only determined a small initial segment of G so far, $\sigma_{e+1}$, we already know that G 
(whatever it will eventually be) will satisfy that $G \neq \Phi_e$. It is in this sense that we 
can say that $\sigma_{e+1}$ forces $G \neq \Phi_e$. 
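
The diagonal construction just described can be sketched in Python as follows. This is only a toy model under stated assumptions: a genuine enumeration of the partial computable functions is replaced by a short, hypothetical list `toy` of Python callables, and divergence is modelled by returning `None` (real divergence cannot be detected this way).

```python
from typing import Callable, List, Optional

# Stand-in for Phi_e: returns an int if the computation "halts", None otherwise.
Phi = Callable[[int], Optional[int]]


def diagonalize(functions: List[Phi]) -> List[int]:
    """Build sigma_n by finite extension, where n = len(functions)."""
    sigma: List[int] = []                 # sigma_0 is the empty string
    for e, phi in enumerate(functions):
        value = phi(e)                    # Phi_e(e), or None if divergent
        if value in (0, 1):
            sigma.append(1 - value)       # sigma_{e+1} = sigma_e ^ (1 - Phi_e(e))
        else:
            sigma.append(0)               # otherwise sigma_{e+1} = sigma_e ^ 0
    return sigma


# Hypothetical toy list playing the role of Phi_0, ..., Phi_4.
toy: List[Phi] = [
    lambda n: 0,
    lambda n: 1,
    lambda n: None,
    lambda n: n % 2,
    lambda n: 7,
]
sigma = diagonalize(toy)
```

Each position e of `sigma` is fixed at stage e and never revisited, which is the sense in which the initial segment of length e + 1 already forces the eventual set to differ from the e-th function.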

That this is possible is not really surprising. After all, to ensure G differs from $\Phi_e$ 
requires only one bit of information, which is present in a finite initial segment. But 
the power of forcing comes from the ability to use conditions to similarly force more 
general properties, including infinitary ones that seemingly do require knowing all 
of G. This is far less obvious. 

To illustrate this, let us stay with the example of constructing a noncomputable 
set G by initial segments. Critical in the finite extension argument is that, for each e, 
we can always find a string forcing $G \neq \Phi_e$, no matter which initial segment of G 
we may have determined up to that point. For this reason, we say the collection of 
strings forcing $G \neq \Phi_e$ is dense: at every stage of the construction, there is always 
an opportunity to meet this collection (i.e., to extend to some string in it). 

We presented the construction in a rather structured way, as is typical in computability 
arguments: we ensured $\sigma_{e+1}$ forced $\Phi_e \neq G$ for each e. We did not actually 
need to do this: we would have been content for any $\sigma_i$ to force $\Phi_e \neq G$, not necessarily 
$i = e + 1$. The only thing we actually need to ensure is that the eventual G does 
not deliberately avoid all of the opportunities to force $\Phi_e \neq G$ for each fixed e. We 
call a construction of this kind generic, meaning roughly that any property that it is 
always possible to satisfy will eventually be satisfied. If one considers the negation 
of this statement, it is perhaps more intuitively clear that a construction that is not 
generic is special or “atypical” in a certain sense (see also Figure 7.1). We will make 
this intuition precise in this chapter. 

Conditions do not need to be finite strings, which are “approximations” in perhaps 
the most obvious sense. A different example is the construction of a path through 
an infinite computable tree. We have already seen this in action, in the various basis 
theorems we proved in Section 2.8. The setup there was always the same: given a 
computable tree T, we obtain $G \in [T]$ through a nested sequence of infinite computable 
subtrees of T. Each such subtree “approximates” more of G, in the sense that 
it determines a longer initial segment of G, and whittles down the space of possible 
extensions of this initial segment (i.e., the space of possible elements G of [T] that 
we could eventually produce). In the parlance of the preceding paragraph, then, the 
conditions were the computable subtrees of T. If we consider, for example, the cone 
avoidance basis theorem (Theorem 2.8.23), where we were given a noncomputable C 
and wanted to ensure that $C \nleq_T G$, we see some of the same issues at work as above. 
In particular, although we indexed our conditions (subtrees) $T = T_0 \supseteq T_1 \supseteq \cdots$ in 
such a way that $T_{e+1}$ forced $C \neq \Phi_e^G$, we again only cared that the latter does get 
forced, and not at all that it is forced by $T_{e+1}$ or any other specific condition in the 
construction. So here, too, any $G \in [T]$ obtained generically would have worked. 

Figure 7.1. An illustration of a non-generic set $A \in 2^\omega$, represented by the thick line as a path 
in $2^{<\omega}$. The solid dots represent initial segments of A; the hollow dots, strings $\sigma$ forcing a certain 
property not satisfied by A. Thus, A appears to be avoiding all opportunities to force this property, 
a very specific (hence, “non-generic”) behavior. 

With some intuition in hand, let us pass to a more formal discussion. Throughout 
this section, it will be helpful to keep in mind the picture of a $G \in \omega^\omega$ being 
constructed by a sequence of approximations of some kind, as in the previous two 
examples. 


7.2 Notions of forcing 


We begin by formalizing what, precisely, we mean by “approximation”. There are 
different ways to go about this. Most treatments agree on the intended examples and 
applications, but possibly differ in certain edge cases. Our approach here is loosely 
based on one of Shore [283]. 


Definition 7.2.1 (Notions of forcing). A notion of forcing (or just forcing) is a triple 
$\mathbb{P} = (P, \leq, V)$ where $(P, \leq)$ is a partial order and V is a map $P \to 2^{<\omega}$ satisfying the 
following. 


1. If $p, q \in P$ with $p \leq q$ then $V(p) \succeq V(q)$. 
2. For each $n \in \omega$ and $q \in P$ there is a $p \in P$ with $p \leq q$ and $|V(p)| \geq n$. 

The elements of P are called conditions; and if $p, q \in P$ satisfy $p \leq q$ then p is said 
to extend (and be an extension of) q. We sometimes also say p is stronger than q, 
and q is weaker than p, in this case. The map V is called a valuation map, and for 
each $p \in P$, the value V(p) is called the valuation of p. 


When dealing with more than one forcing notion, we may decorate the above terms 
by $\mathbb{P}$ for clarity, referring to $\mathbb{P}$ conditions, $\mathbb{P}$ extension, $p \leq_{\mathbb{P}} q$, $\mathbb{P}$ valuation map, etc. 

Every specific forcing notion we will work with will have a natural valuation map, 
and in these cases the notion may be identified with (and by) the underlying partial 
order. We will thus usually only mention valuations explicitly in definitions and in 
proving facts about forcing notions in general. 


Example 7.2.2. Cohen forcing is the forcing whose conditions are finite strings $\sigma \in 2^{<\omega}$, 
extension is string extension, and valuation is the identity. 


In the context of forcing, we will usually refer to strings as Cohen conditions. It 
is worth emphasizing that if $\tau$ is a Cohen condition extending $\sigma$ then $\tau \leq \sigma$ but 
$\sigma \preceq \tau$ (i.e., the orderings are reversed). This can be confusing, and for this reason we 
usually stick to the latter ordering in the context of Cohen forcing. This highlights 
a good question, which is why extension is defined to be $\leq$ instead of $\geq$ anyway. 
The reason is that extension is meant to represent that if $q \leq p$, then q approximates 
more of G than p, and hence there are fewer possible ways to “complete” q to G. 
We can make this explicit as follows. 
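
As a sanity check on the reversed orderings, here is a small Python sketch of Cohen forcing that tests the two clauses of Definition 7.2.1 on all binary strings of length below 4. Representing conditions as Python strings of `0`s and `1`s, and the helper names `extends`, `valuation`, and `lengthen`, are choices made for this illustration only.

```python
from itertools import product


def extends(tau: str, sigma: str) -> bool:
    """tau <= sigma in the forcing order: sigma is an initial segment of tau."""
    return tau.startswith(sigma)


def valuation(sigma: str) -> str:
    """For Cohen forcing the valuation map is the identity."""
    return sigma


def lengthen(q: str, n: int) -> str:
    """A witness for clause 2: pad q with zeros to length at least n."""
    return q + "0" * max(0, n - len(q))


strings = ["".join(bits) for k in range(4) for bits in product("01", repeat=k)]

# Clause 1: p <= q implies V(p) extends V(q) as a string.
clause_1 = all(
    valuation(p).startswith(valuation(q))
    for p in strings
    for q in strings
    if extends(p, q)
)

# Clause 2: every q has an extension with valuation of length at least n.
clause_2 = all(
    extends(lengthen(q, n), q) and len(valuation(lengthen(q, n))) >= n
    for q in strings
    for n in range(6)
)
```

Note that `extends("10", "1")` holds, so the longer string is the smaller (stronger) condition.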


Definition 7.2.3 (Constructible objects). Let $\mathbb{P} = (P, \leq, V)$ be a notion of forcing. 
An element $f \in 2^\omega$ is/can be constructed by $\mathbb{P}$ if there is a sequence of conditions 
$p_0 \geq p_1 \geq \cdots$ such that $\lim_{i \to \infty} |V(p_i)| = \infty$ and $f = \bigcup_{i \in \omega} V(p_i)$. In this case we 
also say f is constructed via the sequence $p_0 \geq p_1 \geq \cdots$. 


We refer to f above as an object in general, because depending on the forcing notion 
we may be viewing it as (the characteristic function) of an object other than just a 
subset of w (see Example 7.2.6). When this is not the case, we will refer to f as a 
set or real, as usual. 

Every element of $2^\omega$ can be constructed by Cohen forcing. We may well wonder, 
then, why we need forcing at all and cannot simply stick to binary strings and finite 
extension arguments. The answer is that different notions of forcing help us better 
control various aspects of the sets they can construct. To illustrate this, we first give 
examples of two important forcing notions. 


Example 7.2.4 (Jockusch and Soare [173]). Let T be an infinite binary tree. 
Jockusch–Soare forcing with subtrees of T is the following notion of forcing. 


• The conditions (called Jockusch–Soare conditions) are pairs (U, m) where $m \in \omega$ 
and U is an infinite subtree of T having a unique node of length m. 

• Extension is defined by $(U^*, m^*) \leq (U, m)$ if $U^* \subseteq U$ and $m^* \geq m$. 

• The valuation map V takes a condition (U, m) to the unique $\sigma \in U$ of length m. 

Notice that (T, 0) is always a condition, and in fact, it is the $\leq$-greatest (i.e., weakest) 
condition. Likewise if f is any path through T, then $(\{\sigma \in 2^{<\omega} : \sigma \prec f\}, n)$ is a 
condition for every n. The objects that can be constructed by Jockusch–Soare forcing 
with subtrees of T are thus precisely the paths through T. 


Example 7.2.5 (Mathias [206]). Let X be an infinite set. Mathias forcing within X 
is the following notion of forcing. 


• The conditions (called Mathias conditions) are pairs (E, I), where E is a finite 
set, I is an infinite subset of X, and E < I. 

• Extension is defined by $(E^*, I^*) \leq (E, I)$ if $E \subseteq E^* \subseteq E \cup I$ and $I^* \subseteq I$. 

• The valuation map V takes (E, I) to the string $\sigma \in 2^{<\omega}$ of length min I with 
$\sigma(n) = 1$ if and only if $n \in E$. 


The sets I are called reservoirs, and if $(E^*, I^*) \leq (E, I)$ and $I \setminus I^*$ is finite then 
$(E^*, I^*)$ is called a finite extension of (E, I). 

Here, $(\emptyset, X)$ is the $\leq$-greatest condition. The objects that can be constructed by 
Mathias forcing within X are precisely the subsets of X. 
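
The valuation and extension relation for Mathias conditions can be illustrated by the following Python sketch. Since an infinite reservoir cannot be stored, a reservoir is modelled here by a membership test, and the extension check only inspects numbers below a hypothetical bound `WINDOW`; it is therefore a finite approximation of the definition, not a decision procedure.

```python
from typing import Callable, FrozenSet, Tuple

# A condition (E, I): E is a finite set, I is given by a membership predicate.
Condition = Tuple[FrozenSet[int], Callable[[int], bool]]
WINDOW = 200  # assumption of this sketch: only numbers below WINDOW are inspected


def min_reservoir(in_I: Callable[[int], bool]) -> int:
    """Least element of the reservoir, searched for below WINDOW."""
    return next(n for n in range(WINDOW) if in_I(n))


def valuation(cond: Condition) -> str:
    """String of length min I whose n-th bit is 1 exactly when n is in E."""
    E, in_I = cond
    return "".join("1" if n in E else "0" for n in range(min_reservoir(in_I)))


def extends_on_window(new: Condition, old: Condition) -> bool:
    """Check E <= E* <= E u I and I* <= I, the latter only below WINDOW."""
    E_new, in_I_new = new
    E_old, in_I_old = old
    finite_ok = E_old <= E_new and all(n in E_old or in_I_old(n) for n in E_new)
    reservoir_ok = all(in_I_old(n) for n in range(WINDOW) if in_I_new(n))
    return finite_ok and reservoir_ok


# p = ({1, 3}, even numbers >= 6); q = ({1, 3, 6, 10}, multiples of 4 >= 12).
p: Condition = (frozenset({1, 3}), lambda n: n >= 6 and n % 2 == 0)
q: Condition = (frozenset({1, 3, 6, 10}), lambda n: n >= 12 and n % 4 == 0)
```

In this example q extends p, and correspondingly the valuation of q extends the valuation of p as a string, as clause 1 of Definition 7.2.1 requires.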


In both examples above, when the given tree T or infinite set X is not relevant to a 
discussion, we may refer simply to Jockusch–Soare forcing or Mathias forcing. So a 
Jockusch–Soare condition is a condition of Jockusch–Soare forcing with subtrees of 
T for some T, etc. We include one more illustrative example. 


Example 7.2.6. Let $\mathbb{P}$ be the notion of forcing whose conditions are functions $\alpha \colon n \to \omega$ 
for some $n \in \omega$, with $\alpha^* \colon n^* \to \omega$ extending $\alpha$ if $n^* \geq n$ and $\alpha^*$ extends $\alpha$ as a 
finite function, and with the valuation of $\alpha \colon n \to \omega$ being the shortest string $\sigma \in 2^{<\omega}$ 
such that $\langle x, \alpha(x) \rangle < |\sigma|$ for all $x < |\alpha|$ and $\sigma(y) = 1$ if and only if $y = \langle x, \alpha(x) \rangle$ 
for some such x. The objects that can be constructed by $\mathbb{P}$ are the (characteristic 
functions of) all functions $f \colon \omega \to \omega$. 


By contrast, not every object (set) constructed by Cohen forcing (restricted to 
$2^{<\omega}$) needs to be a path through a given tree T, a subset of a given set X, or (the 
characteristic function) of a function $\omega \to \omega$. More importantly, no single Cohen 
condition can ensure any of these properties either. For example, outside of trivial 
cases (e.g., when T has paths extending every finite binary string, or $X =^* \omega$), every 
Cohen condition $\sigma \in 2^{<\omega}$ will have an extension $\tau$ such that no $G \succ \tau$ is a path 
through T or a subset of X. And, quite possibly, we may need to extend to such a $\tau$ 
for the sake of satisfying some other property we want G to have. 

In the next section, we will see that this situation, of every condition having 
an extension of a particular kind and our needing to pass to such an extension, is 
actually crucial in forcing constructions in general. So the issue here is not just an 
inconvenience. 

In the sequel, we will look not just at sequences of conditions but at filters of 
conditions, which are sets of conditions that are consistent in a certain sense. 
Definition 7.2.7 (Filters). Let $\mathbb{P} = (P, \leq, V)$ be a notion of forcing. 
1. A filter is a subset F of P satisfying the following properties. 


• (Upward closure): If $q \in F$ and $q \leq p$ then $p \in F$. 
• (Consistency): If $p, q \in F$ there exists $r \in F$ such that $r \leq p$ and $r \leq q$. 


We write V(F) for $\bigcup_{p \in F} V(p)$. If $V(F) \in 2^\omega$, we call it the object (or set or 
real) determined by F. 
2. If $p_0 \geq p_1 \geq \cdots$ is a sequence of conditions then 


$$\{q \in P : (\exists i)[q \geq p_i]\}$$

is a filter, called the filter generated by the sequence of the $p_i$. 

We will also find the following definition useful. 


Definition 7.2.8. Let $\mathbb{P} = (P, \leq, V)$ be a notion of forcing and $\mathcal{K}$ a property defined 
for filters. 


1. A sequence of conditions $p_0 \geq p_1 \geq \cdots$ has the property $\mathcal{K}$ if the filter 
generated by this sequence has the property $\mathcal{K}$. 
2. $G \in \omega^\omega$ has the property $\mathcal{K}$ if G = V(F) for some filter F with property $\mathcal{K}$. 


Notice that a filter F determines an object if and only if it contains conditions 
with arbitrarily long valuations. In this case, it is easy to see that F contains a 
sequence of conditions $p_0 \geq p_1 \geq \cdots$ with $V(F) = \bigcup_{i \in \omega} V(p_i)$. So, in particular, 
the object determined by a filter F can be constructed by $\mathbb{P}$. Conversely, if G can 
be constructed via a sequence of conditions $p_0 \geq p_1 \geq \cdots$ then G = V(F) for the 
filter F generated by this sequence. Thus, for the purposes of constructing elements 
of $\omega^\omega$, there is no difference between working with filters or sequences. Filters are 
more important in set theory, where the forcings tend to be more complicated and 
(countable) sequences of the kind we are considering do not always suffice. Filters 
were carried over to computability theory from set theory, but their continued use is 
mostly a matter of convenience, as it makes stating certain results easier. 
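
The passage from sequences to filters can be made concrete for Cohen forcing, where the filter generated by a descending sequence is just the set of all initial segments of its members. The Python sketch below uses a finite sequence and a bounded universe of strings, both assumptions made so that the filter properties can be checked exhaustively; the function names are illustrative.

```python
from itertools import product
from typing import List, Set


def generated_filter(sequence: List[str]) -> Set[str]:
    """All q with q >= p_i for some i, i.e. all initial segments of some p_i."""
    return {p[:k] for p in sequence for k in range(len(p) + 1)}


def is_weaker(p: str, q: str) -> bool:
    """q <= p in the Cohen order: p is an initial segment of q."""
    return q.startswith(p)


sequence = ["", "1", "10", "1011"]  # a descending sequence p_0 >= p_1 >= ...
F = generated_filter(sequence)

# Bounded universe of conditions used for the exhaustive checks below.
universe = ["".join(bits) for k in range(5) for bits in product("01", repeat=k)]

upward_closed = all(p in F for q in F for p in universe if is_weaker(p, q))
consistent = all(
    any(is_weaker(p, r) and is_weaker(q, r) for r in F) for p in F for q in F
)
longest_valuation = max(F, key=len)  # finite analogue of V(F)
```

With an infinite sequence the same recipe applies, and V(F) is the union of the valuations rather than a longest element.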


Remark 7.2.9. It is important to be aware that just because some condition $p \in P$ 
satisfies $V(p) \preceq G$, this does not mean that G can be constructed via a sequence 
that contains p (or equivalently, that p is part of a filter that determines G). For 
example, suppose we are dealing with Mathias forcing, and that p is the condition 
(@,2” \ {G}). One obvious exception to this is Cohen forcing, where every G can 
be constructed via the sequence of all its initial segments. There, it is interchangeable 
to speak of $\sigma$ being an initial segment of G and $\sigma$ being part of a sequence via which 
G can be constructed. But in general, it is only the latter that properly captures the 
notion of a condition p being an “initial segment” of G. In Mathias forcing, p is part 
of a sequence via which G can be constructed if and only if p has the form (E, I) 
where $E \subseteq G \subseteq E \cup I$. In Jockusch–Soare forcing, it is the case if and only if p has 
the form (U, m) where G is a path through U. 

We conclude this section by looking at several more examples of forcing notions 
and the sets they can construct. One common practice is to restrict the domain of a 
notion of forcing. 


Example 7.2.10. Fix a set X. The following are important variants of Mathias forcing 
within X. 


1. If X is computable, then Mathias forcing within X with computable reservoirs 
is the restriction to conditions (E, I) with I computable. 

2. If $C \nleq_T \emptyset$, then Mathias forcing within X with C-cone avoiding reservoirs is 
the restriction to conditions (E, I) with $C \nleq_T I$. 

3. Mathias forcing within X with PA avoiding reservoirs is the restriction to conditions 
(E, I) such that $I \not\gg \emptyset$. 


In all cases, extension and valuation are as before, within the restricted set of conditions. 
We can also relativize these notions to any $A \in 2^\omega$, obtaining, e.g., Mathias 
forcing within X with A-computable reservoirs, or Mathias forcing within X with 
conditions (E, I) such that $C \nleq_T A \oplus I$. (Note that in the latter example we do not 
say “C-cone avoiding reservoirs relative to A” or anything too cumbersome.) 


Example 7.2.11. Fix an infinite computable tree $T \subseteq 2^{<\omega}$. Jockusch–Soare forcing 
with computable subtrees of T is the restriction to conditions (U, m) where U is 
computable, with extension and valuation as before. Since the set of paths through 
an infinite computable tree forms a $\Pi^0_1$ class, this is also sometimes called forcing 
with $\Pi^0_1$ classes (in this case, $\Pi^0_1$ subclasses of [T]). 

A common application of this forcing is to trees T with no computable paths. For 
every infinite computable subtree U of such a T, there is necessarily a maximal m 
such that U contains a unique node of length m, and this m can be found computably 
from (an index for) U. In this case, then, it is customary to specify a condition (U, m) 
simply by U. 


We can also obtain new forcing notions from old ones by restricting the extension 
relation rather than the set of conditions. 


Example 7.2.12. Consider the forcing notion from Example 7.2.6. Modify the extension 
relation as follows: $\alpha^* \colon n^* \to \omega$ extends $\alpha \colon n \to \omega$ only if $n^* \geq n$ and for every 
$x < |\alpha^*|$ there is a $y < |\alpha|$ such that $\alpha^*(x) \leq \alpha(y)$. Then the objects that can be 
constructed by this forcing are precisely the (characteristic functions of) functions 
$\omega \to \omega$ with bounded range, i.e., the instances of $\mathsf{IPHP}$ or $\mathsf{RT}^1$. 


Another useful method combines two or more forcing notions into one in parallel. 


Definition 7.2.13 (Product forcing). Let $I \subseteq \omega$ be finite or infinite, and for each 
$i \in I$, fix a forcing notion $\mathbb{P}_i = (P_i, \leq_i, V_i)$. The product forcing $\prod_{i \in I} \mathbb{P}_i$ is the notion 
of forcing defined as follows. 


1. The conditions are all tuples $(p_i : i \in F) \in \prod_{i \in F} P_i$, where F is a finite initial 
segment of I (i.e., $F \subseteq I$, and if $i \in F$ and $j \in I$ with $j < i$ then $j \in F$). 

2. A condition $(p^*_i : i \in F^*)$ extends a condition $(p_i : i \in F)$ if $F \subseteq F^*$ and 
$p^*_i \leq_i p_i$ for all $i \in F$. 

3. The valuation takes a condition $(p_i : i \in F)$ to $\langle V_i(p_i) : i \in F \rangle$, i.e., to the 
string $\sigma \in 2^{<\omega}$ of length $|F| \cdot m$, where $m = \min\{|V_i(p_i)| : i \in F\}$, and 
$\sigma(\langle i, x \rangle) = V_i(p_i)(x)$ for all $i \in F$ and all $x < m$. 


If $\mathbb{P}_i = \mathbb{P}$ for all i, we write $\prod_{i \in I} \mathbb{P}$ for $\prod_{i \in I} \mathbb{P}_i$. If I is finite, the definition is 
sometimes modified so that F above is always just I. 


For example, if $\mathbb{C}$ denotes Cohen forcing we can consider the product forcing $\prod_{i \in \omega} \mathbb{C}$. 
It is easy to see that the sets that can be constructed by $\prod_{i \in \omega} \mathbb{C}$ are all sequences 
$(X_i : i \in \omega)$ of elements $X_i \in 2^\omega$. Equivalently, we can view these as instances of 
$\widehat{\mathsf{RT}^1_2}$ or of $\mathsf{COH}$, if we wish. By contrast, if $\mathbb{P}$ is the forcing from Example 7.2.12 then 
we can think of the objects that can be constructed by $\prod_{i \in \omega} \mathbb{P}$ as being the instances 
of $\widehat{\mathsf{IPHP}}$. 


7.3 Density and genericity 


Finding the right partial order and valuation map is the first step in a forcing con- 
struction. The next step is to actually force the desired object to have the properties 
we want by carefully selecting conditions to add to a sequence via which the object 
is constructed. We cannot do this arbitrarily. For example, if we wish to add a condi- 
tion from some set S, we must not have previously added a condition p that has no 
extension in S! This would seem to require knowing at each “stage” the totality of 
the sets we will wish to add from in the future, a daunting prospect. We avoid this 
complication by choosing conditions from dense sets. 


Definition 7.3.1 (Density). Let $\mathbb{P} = (P, \leq, V)$ be a notion of forcing. 


1. A set $D \subseteq P$ is dense if for every $p \in P$ there is a $q \in D$ such that $q \leq p$ (i.e., 
every condition has an extension in D). 

2. Given $p_0 \in P$, a set $D \subseteq P$ is dense below $p_0$ if for every $p \in P$ with $p \leq p_0$ 
there is a $q \in D$ such that $q \leq p$. 

3. A set $F \subseteq P$ meets a set $S \subseteq P$ if $F \cap S \neq \emptyset$; F avoids S if there is a $p \in F$ so 
that no $q \leq p$ belongs to S. 


Thus, in the above parlance, no dense set of conditions can be avoided by any set 
of conditions, and in particular, if we wish we can always add a condition to our 
sequence from any given dense set. 

We now come to the main definition through which the construction of an object 
by forcing proceeds. 


Definition 7.3.2 (Genericity). Let $\mathbb{P} = (P, \leq, V)$ be a notion of forcing. Let $\mathcal{D}$ be a 
collection of dense subsets of P. A filter F is $\mathcal{D}$-generic (or generic with respect to 
$\mathcal{D}$) if it meets every $D \in \mathcal{D}$. 

When $\mathcal{D}$ is understood or implied, we may call a filter or element of $\omega^\omega$ simply 
generic instead of $\mathcal{D}$-generic. When we need to specify $\mathbb{P}$, we may also say a filter or 
element of $\omega^\omega$ is $\mathcal{D}$-generic for $\mathbb{P}$. For some common forcings, like Cohen forcing 
or Mathias forcing, we may also speak of Cohen generics or Mathias generics, 
etc. Using Definition 7.2.8, we obtain also a notion of a descending sequence of 
conditions being $\mathcal{D}$-generic, as well as of an element of $\omega^\omega$ being $\mathcal{D}$-generic, and 
we use the same conventions in speaking about these. 

The definition of a sequence $p_0 \geq p_1 \geq \cdots$ being $\mathcal{D}$-generic is weaker than 
saying that it actually meets (as a set of conditions) every $D \in \mathcal{D}$. The two are 
equivalent provided each D is closed downward under $\leq$, i.e., $p \in D$ and $q \leq p$ 
implies $q \in D$. This holds in virtually all cases of interest to us here, but 
we maintain the distinction for generality. 

The existence of generic filters follows from a result originally proved by Rasiowa 
and Sikorski [254]. The proof of the version here is quite simple, but it is arguably 
the most important general result in the theory of forcing. 


Theorem 7.3.3 (Rasiowa and Sikorski [254]). Let $\mathbb{P} = (P, \leq, V)$ be a notion of 
forcing, p a condition, and $\mathcal{D}$ a countable family of dense subsets of P. There exists 
a $\mathcal{D}$-generic filter containing p. 


Proof. Fix $p \in P$. As $\mathcal{D}$ is countable, we may enumerate its members as $D_0, D_1, \ldots$. 
Let $p_0 = p$, and given $p_i \in P$ for some $i \in \omega$, let $p_{i+1}$ be any extension of $p_i$ in $D_i$, 
which must exist since $D_i$ is dense. The filter generated by the $p_i$ is $\mathcal{D}$-generic and 
contains p. $\square$ 
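
The proof is a simple loop, which the following Python sketch mimics for Cohen forcing. As an assumption of the sketch, each dense set is represented not as a set but by a hypothetical "witness" function that maps a condition to one of its extensions lying in that dense set, and only finitely many dense sets are handled.

```python
from typing import Callable, List

Witness = Callable[[str], str]  # maps p to some extension of p in a dense set


def generic_sequence(p: str, witnesses: List[Witness]) -> List[str]:
    """Build p_0 = p, and p_{i+1} an extension of p_i in the i-th dense set."""
    sequence = [p]
    for witness in witnesses:
        q = witness(sequence[-1])
        assert q.startswith(sequence[-1]), "witness must return an extension"
        sequence.append(q)
    return sequence


def contains(block: str) -> Witness:
    """Dense set of strings containing `block` as a substring."""
    return lambda p: p if block in p else p + block


def length_at_least(n: int) -> Witness:
    """Dense set of strings of length at least n."""
    return lambda p: p + "0" * max(0, n - len(p))


seq = generic_sequence("01", [contains("11"), contains("000"), length_at_least(10)])
```

Because each of these dense sets is closed under extension, the final condition of the sequence lies in all of them at once, which reflects the remark above about dense sets that are closed downward.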


Multiple countable collections of dense sets can be combined, and the Rasiowa–Sikorski 
theorem can be applied to produce a filter generic with respect to them all. 
This means that forcing constructions are often modular, divided into parts, each of 
which specifies a particular collection of dense sets of conditions. A generic filter is 
then chosen at the end. 


Convention 7.3.4 (Filters determine objects). By definition, in any forcing notion 
$(P, \leq, V)$ the set of conditions p with $|V(p)| \geq n$ is dense for every n. We adopt the 
convention that all generic filters are assumed to be generic also for this collection 
of dense sets. Therefore, every generic filter F determines a generic object, i.e., 
$V(F) = \bigcup_{p \in F} V(p)$ is actually an element of $\omega^\omega$. 


More generally, we use the adjective sufficiently generic to refer to filters generic 
with respect to all collections of dense sets we have specified explicitly, as well as 
any that, perhaps, show up frequently enough in a particular discussion for us to take 
as implied. 


Definition 7.3.5 (Sufficient genericity). Let $\mathbb{P}$ be a notion of forcing and $\mathcal{K}$ a class 
of filters. We say every sufficiently generic filter belongs to $\mathcal{K}$ if there is a countable 
collection of dense sets $\mathcal{D}$ so that every $\mathcal{D}$-generic filter belongs to $\mathcal{K}$. 


Again, we obtain also a notion of sufficient genericity for descending sequences of 
conditions and elements of $\omega^\omega$. For example, our initial discussion in Section 7.1 
makes it evident that every sufficiently generic set for Cohen forcing is noncomputable. 
Let us consider some further examples. 


Example 7.3.6. Recall that a set $Y \in 2^\omega$ is high if $Y' \geq_T \emptyset''$, or equivalently, if there 
is a Y-computable function that dominates every computable function. Fix any set X, 
and consider Mathias forcing within X. For each $e \in \omega$, the collection of conditions 
(E, I) with the following property is dense: if $\Phi_e$ is a total and increasing function 
then $p_{E \cup I}(x) \geq \Phi_e(x)$ for all $x \geq |E|$. Indeed, suppose $\Phi_e$ is total and increasing 
and (E, I) is any condition, say with $I = \{x_0 < x_1 < \cdots\}$. Let $I^* = \{x_{\Phi_e(i)} : i \geq |E|\}$. 
Then $(E, I^*)$ is an extension of (E, I) of the desired kind. Thus, if G is a sufficiently 
generic set then $p_G$ will dominate every computable function, and therefore G will 
be high. (Formally, we could let $D_e$ be the set of all conditions (E, I) as above, and 
then formulate our claim for all $\{D_e : e \in \omega\}$-generic sets.) 
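
The thinning step in this example can be checked numerically on a finite window. In the Python sketch below, the infinite reservoir is replaced by a finite list `I_window` of its first elements and the total increasing function is a hypothetical stand-in `phi`; both are assumptions of the illustration, so the check covers only the portion of the principal function determined by the window.

```python
from typing import Callable, List


def thin_reservoir(E: List[int], I_window: List[int],
                   phi: Callable[[int], int]) -> List[int]:
    """Return the elements x_{phi(i)}, i >= |E|, that fall inside the window."""
    I_star = []
    i = len(E)
    while phi(i) < len(I_window):
        I_star.append(I_window[phi(i)])  # x_{phi(i)}, indexing I in increasing order
        i += 1
    return I_star


E = [0, 2]                       # finite part, with max E < min I
I_window = list(range(5, 200))   # first elements of the reservoir I
phi = lambda i: 3 * i + 1        # stand-in for a total increasing function

I_star = thin_reservoir(E, I_window, phi)
principal = sorted(E) + I_star   # initial part of the principal function of E u I*
dominates = all(principal[x] >= phi(x) for x in range(len(E), len(principal)))
```

For x at least |E|, the x-th element of the union is the element of I with index phi(x), which is at least phi(x) because the elements of I are listed in increasing order.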


Example 7.3.7. Let C be a noncomputable set. Consider Mathias forcing with C-cone 
avoiding reservoirs. For each $e \in \omega$, the collection of conditions (E, I) with the 
following property is dense: there is an $x \in \omega$ such that either $\Phi_e^E(x)\downarrow \neq C(x)$ or 
$\Phi_e^{E \cup F}(x)\uparrow$ for all finite $F \subseteq I$. (Indeed, suppose (E, I) has no extension with this 
property. By the failure of the second possibility, for each x there is a finite $F \subseteq I$ 
such that $\Phi_e^{E \cup F}(x)\downarrow$, and by the failure of the first possibility, the value of this 
computation must be C(x). Thus, I can compute C, contrary to the fact that (E, I) 
is C-cone avoiding.) If G is a sufficiently generic set, it follows that $G \ngeq_T C$. 


Example 7.3.8. Consider Mathias forcing (within $\omega$) with PA avoiding reservoirs. 
Then for each $e \in \omega$, the collection of conditions (E, I) satisfying the following 
property is dense: there is an x such that either $\Phi_e^E(x)\downarrow = \Phi_x(x)\downarrow$ or $\Phi_e^{E \cup F}(x)\downarrow = 
y \to y \notin \{0, 1\}$ for all finite $F \subseteq I$. (If not, there would be a condition (E, I) such 
that I computes a 2-valued DNC function, which would mean $I \gg \emptyset$.) If G is a 
sufficiently generic set, it follows that G computes no 2-valued DNC function, and 
hence that $G \not\gg \emptyset$. 


It is easy to check that, in each of the above examples, the extension $(E^*, I^*)$ 
of a given condition (E, I) in the relevant dense set satisfies $I^* \leq_T I$. Hence, the 
constructions can be combined. For example, if we force with C-cone avoiding 
conditions, we can combine Examples 7.3.6 and 7.3.7 to obtain a generic G that is 
both high and does not compute C. 

We can also obtain computability theoretic facts about various instance-solution 
problems, as in Chapter 4. 


Example 7.3.9. Take any instance of the problem COH defined in Definition 4.1.6.
This is a sequence $\vec{R} = \langle R_i : i \in \omega \rangle$ of sets. Fix any set $X$, and consider Mathias
forcing within $X$. For each $i \in \omega$, the set of conditions $(E, I)$ satisfying the following
property is dense: either $I \subseteq R_i$ or $I \subseteq \overline{R_i}$. Likewise, for each $i$, the set of conditions
$(E, I)$ with $|E| \geq i$ is dense. It follows that any sufficiently generic set $G$ is $\vec{R}$-cohesive
and infinite, i.e., a COH-solution to $\vec{R}$.
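
The density argument in Example 7.3.9 can be tried out in a toy setting. The following Python sketch is only an illustration under a strong simplifying assumption: every set involved is periodic (a union of residue classes), so that "infinite" is decidable. The class `Periodic` and the function `cohesive_prefix` are hypothetical names introduced here, not notation from the text.

```python
from math import gcd


class Periodic:
    """The set {x >= lo : x % mod in residues}; infinite iff residues is nonempty."""

    def __init__(self, mod, residues, lo=0):
        self.mod, self.residues, self.lo = mod, frozenset(residues), lo

    def is_infinite(self):
        return bool(self.residues)

    def minimum(self):
        # Only called on infinite sets, so the search terminates.
        x = self.lo
        while x % self.mod not in self.residues:
            x += 1
        return x

    def refine(self, mod, residues, complement=False):
        """Intersect with {x : x % mod in residues}, or with its complement."""
        m = self.mod * mod // gcd(self.mod, mod)
        keep = [r for r in range(m)
                if r % self.mod in self.residues
                and ((r % mod in residues) != complement)]
        return Periodic(m, keep, self.lo)

    def above(self, n):
        return Periodic(self.mod, self.residues, max(self.lo, n + 1))


def cohesive_prefix(family, stages):
    """Run `stages` steps: stage i decides R_i, then moves min(I) into E."""
    E, I = [], Periodic(1, {0})          # start with the condition (empty set, omega)
    for i in range(stages):
        mod, residues = family[i]
        inside = I.refine(mod, residues)
        # Extend into the dense set: I inside R_i if that is infinite, else inside its complement.
        I = inside if inside.is_infinite() else I.refine(mod, residues, complement=True)
        x = I.minimum()
        E.append(x)                      # now |E| = i + 1
        I = I.above(x)
    return E, I


# R_0 = evens, R_1 = {x : x % 3 == 1}, R_2 = odds, R_3 = {x : x % 5 in {0, 2}}
family = [(2, {0}), (3, {1}), (4, {1, 3}), (5, {0, 2})]
E, I = cohesive_prefix(family, 4)
print(E)
```

Every element placed into `E` at stage `i` or later lies on the same side of `R_i`, which is the finite shadow of cohesiveness. In the genuine construction the choice between the two intersections is not computable in general, which is why the complexity of the reservoirs matters in the discussion that follows.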


In this proof, the extension $(E^*, I^*)$ of a given condition $(E, I)$ with $I^* \subseteq R_i$ or
$I^* \subseteq \overline{R_i}$ satisfies $I^* \leq_T \vec{R} \oplus I$. Suppose $\vec{R} \not\gg \emptyset$ and consider Mathias forcing with reservoirs
$I$ such that $\vec{R} \oplus I \not\gg \emptyset$. We can then relativize Example 7.3.8 to $\vec{R}$, and combine it
with Example 7.3.9 to obtain a set $G$ such that $G$ is $\vec{R}$-cohesive and $\vec{R} \oplus G \not\gg \emptyset$. Thus
we have a slicker (and more flexible) proof of Proposition 3.8.5 that COH admits PA
avoidance.

We conclude this section by justifying the intuition, alluded to in Section 7.1, that 
generic sets are “typical” in a certain sense. That sense is topological. (This is not a 
result we will need in the rest of our discussion, but it is of independent interest.) 


Definition 7.3.10. Let $\mathbb{P} = (P, \leq, V)$ be a notion of forcing.


1. For $p \in P$, $[[p]]$ is the set of all $f \in \omega^\omega$ that can be constructed by $\mathbb{P}$ via a
sequence containing $p$.

2. $\mathcal{F}_{\mathbb{P}} \subseteq \omega^\omega$ is the space of all sets that can be constructed by $\mathbb{P}$, with topology
generated by basic open sets of the form $[[p]]$ for $p \in P$.


Note that if $\mathbb{C}$ denotes Cohen forcing then $\mathcal{F}_{\mathbb{C}}$ is just $2^\omega$ with the usual Cantor space
topology, with $[[\sigma]]$ for $\sigma \in 2^{<\omega}$ referring to the usual basic open set.

If $q \leq p$ then $[[q]] \subseteq [[p]]$. This is because any sequence containing $q$ and
used to construct some $f \in [[q]]$ can be truncated above $q$, and $q$ replaced by $p$,
to produce another sequence constructing $f$. From this, it is readily observed that if
$D \subseteq P$ is dense (in the sense of forcing) then $\bigcup_{p \in D} [[p]] \subseteq \mathcal{F}_{\mathbb{P}}$ is dense open (in the
sense of the $\mathbb{P}$ topology).

We will need one technical assumption on our forcings. 


Definition 7.3.11 (Filter valuation property). A notion of forcing $\mathbb{P} = (P, \leq, V)$
satisfies the filter valuation property (FVP) if for each $f \in \omega^\omega$, the set of conditions
$p$ with $f \in [[p]]$ is a filter, or equivalently, if for all $p, q \in P$ with $f \in [[p]] \cap [[q]]$,
there is an $r \in P$ extending both $p$ and $q$ with $f \in [[r]]$.


It is easy to see that, in addition to the obvious Cohen forcing, all examples of forcing 
notions we considered in the preceding section satisfy FVP, as will all other forcings 
we consider in the sequel. 

Recall now that a property of the members of a topological space is called generic 
if it holds almost everywhere in the sense of category. That is, a property is generic if
the elements of which it holds form a comeager set (a set containing a countable intersection
of dense open sets). The result here is that this notion of genericity accords nicely
with the one from forcing: namely, any property enjoyed by all sufficiently generic 
objects for a forcing satisfying FVP is generic in the topological sense. 


Proposition 7.3.12. Let $\mathbb{P} = (P, \leq, V)$ be a notion of forcing satisfying FVP. Then
for any collection $\mathcal{D}$ of dense subsets of $P$ the set of $\mathcal{D}$-generic functions is comeager
in the $\mathbb{P}$ topology.


Proof. We may assume that for each $n$, the set of all $p \in P$ with $|V(p)| \geq n$ is an
element of $\mathcal{D}$. For each $D \in \mathcal{D}$, let $U_D = \bigcup_{p \in D} [[p]]$, which is a dense open set
in the $\mathbb{P}$ topology, as discussed above. We claim that every element of $\bigcap_{D \in \mathcal{D}} U_D$ is
$\mathcal{D}$-generic, which yields the result. Fix $f \in \bigcap_{D \in \mathcal{D}} U_D$ and let $F$ be the set of $p \in P$
with $f \in [[p]]$. Since $\mathbb{P}$ satisfies FVP, $F$ is a filter. Moreover, $F$ meets every $D \in \mathcal{D}$
by choice of $f$, so $F$ is $\mathcal{D}$-generic. By our assumption on $\mathcal{D}$, $F$ contains conditions
with arbitrarily long valuations, so $f = V(F)$. □


Corollary 7.3.13 (Jockusch [169]). The collection of Cohen generic sets is comea- 
ger in the usual Cantor space topology. 


7.4 The forcing relation 


There are times when the properties we wish to ensure of our eventual generic are 
difficult to express in the manner illustrated by our examples up to this point, as 
explicit dense sets defined in terms of the forcing notion. We therefore introduce a 
more systematic approach for doing so. To this end, we need to connect forcing with 
the language we use to talk about generic objects. 

Formally, our forcing language is $\mathcal{L}_2$ augmented by a new constant symbol $G$
intended to denote the generic object. We denote this language by $\mathcal{L}_2(G)$. Every
formula here is thus of the form $\varphi(G, \vec{X}, \vec{x})$ for some formula $\varphi(X, \vec{X}, \vec{x})$ of $\mathcal{L}_2$.
When we wish to say that $G$ does not occur in a formula we state explicitly that it is a
formula of $\mathcal{L}_2$. Otherwise, we are free to follow the same conventions and notations
in $\mathcal{L}_2(G)$ as we did in $\mathcal{L}_2$. In particular, we say a formula of $\mathcal{L}_2(G)$ is $\Sigma^0_n$ or $\Pi^0_n$ if it
has the form $\varphi(G)$ for some $\Sigma^0_n$ or $\Pi^0_n$ formula $\varphi(X)$ of $\mathcal{L}_2$.

In what follows, let $\mathcal{M}_\omega$ be the full $\omega$-model $(\omega, \mathcal{P}(\omega))$ and say a sentence of $\mathcal{L}_2$
is true if it is satisfied by $\mathcal{M}_\omega$.


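Definition 7.4.1 (Forcing relation). Let $\mathbb{P} = (P, \leq, V)$ be a notion of forcing. Let
$p$ be any condition, and $\varphi$ an arithmetical sentence of $\mathcal{L}_2(G)$. We define the relation
of $p$ forcing $\varphi$, written $p \Vdash \varphi$, as follows.


1. If $\varphi$ is an atomic sentence of $\mathcal{L}_2$ then $p \Vdash \varphi$ if $\varphi$ is true.

2. If $\varphi$ is $t \in G$ for a term $t$ then $p \Vdash \varphi$ if $t^{\mathcal{M}_\omega} < |V(p)|$ and $V(p)(t^{\mathcal{M}_\omega}) = 1$.

3. If $\varphi$ is $(\exists x < t)\psi(x)$ for a term $t$ then $p \Vdash \varphi$ if there is an $m \in \omega$ such that
$m <^{\mathcal{M}_\omega} t^{\mathcal{M}_\omega}$ and $p \Vdash \psi(m)$.

4. If $\varphi$ is $\psi_0 \vee \psi_1$ then $p \Vdash \varphi$ if $p \Vdash \psi_0$ or $p \Vdash \psi_1$.

5. If $\varphi$ is $\neg\psi$ then $p \Vdash \varphi$ if $q \nVdash \psi$ for all $q \leq p$.

6. If $\varphi$ is $(\exists x)\psi(x)$ then $p \Vdash \varphi$ if there is an $m \in \omega$ such that $p \Vdash \psi(m)$.


We say $p$ decides $\varphi$ if $p \Vdash \varphi$ or $p \Vdash \neg\varphi$.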
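
The clauses of Definition 7.4.1 can be exercised on a small scale. The sketch below is an illustration only, under two assumptions that are not part of the definition: it is restricted to Cohen forcing (conditions are tuples of bits) and to quantifier-free sentences built from atoms of the form "n in G". For such sentences, whether a condition forces a sentence depends only on the bits at positions mentioned in it, so the quantification over all extensions in clause 5 can be replaced by a finite search. The tuple encoding of formulas and the names `forces`, `extensions`, and `max_index` are hypothetical.

```python
from itertools import product


def max_index(phi):
    """Largest n such that the atom 'n in G' occurs in phi (-1 if none)."""
    if phi[0] == "in":
        return phi[1]
    if phi[0] == "const":
        return -1
    return max(max_index(sub) for sub in phi[1:])


def extensions(p, length):
    """All binary strings of the given length that extend p."""
    return [p + tail for tail in product((0, 1), repeat=length - len(p))]


def forces(p, phi):
    """Strong forcing for Cohen conditions p and quantifier-free sentences phi."""
    kind = phi[0]
    if kind == "const":                  # clause 1: atomic sentence of L_2, already evaluated
        return phi[1]
    if kind == "in":                     # clause 2: the sentence 'n in G'
        n = phi[1]
        return n < len(p) and p[n] == 1
    if kind == "or":                     # clause 4
        return forces(p, phi[1]) or forces(p, phi[2])
    if kind == "not":                    # clause 5: no extension forces the subformula
        bound = max(len(p), max_index(phi) + 1)
        return not any(forces(q, phi[1]) for q in extensions(p, bound))
    raise ValueError(f"unknown connective: {kind}")
```

For instance, with `phi = ("or", ("in", 0), ("in", 1))`, the call `forces((0, 1), phi)` succeeds because the second disjunct is forced, whereas `forces((0,), phi)` fails because the condition `(0,)` is too short to force either disjunct.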


We could also start with a collection $\mathcal{B}$ of subsets of $\omega$ and formulate our discussion
over $\mathcal{L}_2(\mathcal{B})$ (as defined in Definition 5.1.3) instead of $\mathcal{L}_2$. Thus our forcing
language would be $\mathcal{L}_2(\mathcal{B})(G)$, and the definition of $\Vdash$ would have part (1) modified
to apply to atomic formulas of $\mathcal{L}_2(\mathcal{B})$. We will not explicitly phrase things in this
generality, but everything should be understood as “relativizing” in this sense.

The forcing relation behaves a bit like the satisfaction predicate for structures, but 
not in all ways. For starters, a condition need not decide a formula $\varphi$, and so it may
not force the tautology $\varphi \vee \neg\varphi$ (see Exercise 7.8.4). What we are calling the forcing
relation here is properly called strong forcing, and should be compared with weak
forcing, which is more common in set theory. The definition of the latter, which we
denote by $\Vdash_w$, differs in its treatment of disjunctions and quantifiers.


• $p \Vdash_w \psi_0 \vee \psi_1$ if for every $q \leq p$ there is $r \leq q$ such that $r \Vdash_w \psi_0$ or $r \Vdash_w \psi_1$.
• $p \Vdash_w (\exists x)\psi(x)$ if for every $q \leq p$ there is $r \leq q$ and $n$ such that $r \Vdash_w \psi(n)$.


With these modifications, it is not difficult to see that if $p \Vdash \varphi$ then $p \Vdash_w \varphi$, but not
necessarily the other way around. Notably, every condition does weakly force every
tautology. However, it is also true that the two relations agree up to double negation:
$p \Vdash_w \varphi$ if and only if $p \Vdash \neg\neg\varphi$ (see Exercise 7.8.5). Since semantically $\varphi$ and $\neg\neg\varphi$
are equivalent, the choice of which forcing to use is actually inconsequential for
deciding properties about generic objects. But strong forcing enjoys an important
advantage over weak forcing for computability theory applications, which is that
it has a simpler definition. As we describe in the next section, this means that
constructions employing strong forcing can be carried out more effectively.
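
To see the difference concretely, consider the following small worked example in Cohen forcing. It is our own illustration rather than one taken from the text: $\varphi$ is the sentence $0 \in G$ and $\langle\rangle$ denotes the empty condition.

```latex
\begin{align*}
&\langle\rangle \nVdash 0 \in G
  && \text{since } |V(\langle\rangle)| = 0, \\
&\langle\rangle \nVdash \neg(0 \in G)
  && \text{since } \langle 1 \rangle \leq \langle\rangle
     \text{ and } \langle 1 \rangle \Vdash 0 \in G, \\
&\langle\rangle \nVdash (0 \in G) \vee \neg(0 \in G)
  && \text{by the clause for disjunctions}, \\
&\langle\rangle \Vdash_w (0 \in G) \vee \neg(0 \in G)
  && \text{since every } q \leq \langle\rangle
     \text{ has an extension } r \text{ with } |r| \geq 1, \\
& && \text{and } r \Vdash_w 0 \in G \text{ if } r(0) = 1,
     \text{ while } r \Vdash_w \neg(0 \in G) \text{ if } r(0) = 0.
\end{align*}
```

So the empty condition fails to strongly force this tautology but does weakly force it, in line with the remarks above.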
We collect some basic facts about the forcing relation. 


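Proposition 7.4.2. Let $\mathbb{P} = (P, \leq, V)$ be a notion of forcing, $p$ a condition, and $\varphi$
and $\psi$ be arithmetical sentences of $\mathcal{L}_2(G)$.


1. If $p \Vdash \varphi$ then $p \Vdash \neg\neg\varphi$.

2. If $p \Vdash \neg\neg\varphi$ then some $q \leq p$ forces $\varphi$.

3. If $p \Vdash \varphi \wedge \psi$ then some $q \leq p$ forces both $\varphi$ and $\psi$.

4. $p$ cannot force both $\varphi$ and $\neg\varphi$.

5. If $p \Vdash \varphi$ and $q \leq p$ then $q \Vdash \varphi$.

6. If $\varphi$ is a sentence of $\mathcal{L}_2$ and $p$ decides $\varphi$, then $p \Vdash \varphi$ if and only if $\varphi$ is true.

7. The collection of conditions that decide $\varphi$ is dense.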


Proof. Part 4 follows immediately from the definition of forcing negations. Part 5 is
proved by induction on the complexity of $\varphi$, using the monotonicity of the valuation
map in the case where $\varphi$ is atomic. The other parts are left to Exercise 7.8.6. □


The last part above is what connects the forcing relation to genericity. The relevant 
definition is the following. 


Definition 7.4.3 (n-genericity). Let $\mathbb{P} = (P, \leq, V)$ be a notion of forcing and $n \in \omega$.
A filter $F$ is n-generic if every $\Sigma^0_n$ sentence $\varphi$ of $\mathcal{L}_2(G)$ is decided by some $p \in F$.


The notion also relativizes. If we work with $\mathcal{L}_2(\mathcal{B})(G)$ for some collection $\mathcal{B}$ of
subsets of $\omega$, we get a corresponding notion of n-generic relative to $\mathcal{B}$, or
n-$\mathcal{B}$-generic for short. For a single set $B$, we say n-$B$-generic instead of n-$\{B\}$-generic.

We can relate this genericity to our previous one (Definition 7.3.2) as follows.
Enumerate as $\varphi_0, \varphi_1, \ldots$ all $\Sigma^0_n$ sentences of $\mathcal{L}_2(G)$. Then, for each $e$, define


$D_e = \{p \in P : p \Vdash \varphi_e \text{ or } p \Vdash \neg\varphi_e\}$,


and let $\mathcal{D} = \{D_e : e \in \omega\}$. Then $\mathcal{D}$ is a collection of dense sets by
Proposition 7.4.2 (7), and being n-generic is precisely the same as being $\mathcal{D}$-generic. In
particular, we have the following as an immediate consequence of the
Rasiowa-Sikorski theorem.


Theorem 7.4.4 (Existence of n-generics). Let P be a notion of forcing, p a condition, 
and n € w. There exists an n-generic filter containing p. 


(In the relativized case, we get the existence of an n-$\mathcal{B}$-generic so long as $\mathcal{B}$ is a
countable collection of subsets of $\omega$.)

Clearly, every $(n + 1)$-generic is n-generic, for all $n$. We also have the following
basic facts.


Proposition 7.4.5. Let $\mathbb{P} = (P, \leq, V)$ be a notion of forcing, $n \in \omega$, and $F$ an
n-generic filter. If $\varphi$ is a $\Pi^0_n$ sentence of $\mathcal{L}_2(G)$ then some $p \in F$ decides $\varphi$.


Proof. Fix a $\Sigma^0_n$ sentence $\psi$ so that $\varphi$ is $\neg\psi$. Then some $p \in F$ must decide $\psi$. If $p$
forces $\neg\psi$ then this is $\varphi$. If, instead, $p$ forces $\psi$ then by Proposition 7.4.2 $p$ forces
$\neg\neg\psi$, which is $\neg\varphi$. □


With this in hand, we can state and prove the central result of this section, which 
relates forcing and truth. 


Theorem 7.4.6 (Forcing and truth). Let $\mathbb{P} = (P, \leq, V)$ be a notion of forcing,
$n \in \omega$, $F$ an n-generic filter determining the object $G$, and $\varphi(X)$ a $\Sigma^0_n$ or $\Pi^0_n$ formula
of $\mathcal{L}_2$. Then $\varphi(G)$ is true if and only if some $p \in F$ forces $\varphi(G)$.


Proof. We proceed by induction on the complexity of $\varphi$. If $\varphi$ is an atomic sentence
of $\mathcal{L}_2$, then it is either true, and then forced by all conditions, or false, and then
forced by none, and this does not depend on $G$. Next, suppose $\varphi$ is atomic and of
the form $t \in X$ for some term $t$. This is satisfied by $G$ if and only if $G(t^{\mathcal{M}_\omega}) = 1$,
hence if and only if there is a $p \in F$ with $|V(p)| > t^{\mathcal{M}_\omega}$ and $V(p)(t^{\mathcal{M}_\omega}) = 1$, which
precisely means that $p \Vdash \varphi(G)$.

It is clear that if $\varphi$ is $\psi_0 \vee \psi_1$, and the theorem holds of each of $\psi_0$ and $\psi_1$ then it
also holds of $\varphi$. Suppose now that $\varphi$ is $\neg\psi$ and that the theorem holds of $\psi$. Notice
that $\psi$ is $\Pi^0_n$ or $\Sigma^0_n$, depending as $\varphi$ is $\Sigma^0_n$ or $\Pi^0_n$, respectively. Since $F$ is n-generic,
some condition in $F$ decides $\psi(G)$ (in the former case, by Proposition 7.4.5). Now
$\varphi(G)$ is true if and only if $\psi(G)$ is false, which is the case if and only if no condition
in $F$ forces $\psi(G)$ by hypothesis. Thus, $\varphi(G)$ holds if and only if a condition in $F$
forces $\neg\psi(G) = \varphi(G)$. Thus, the theorem holds of $\varphi$.

Next, suppose $\varphi(X)$ is $(\exists x < t)\psi(X, x)$ for some term $t$. We have that $\varphi(G)$
holds if and only if $\psi(G, m)$ holds for some $m < t^{\mathcal{M}_\omega}$. By hypothesis, this is the
case if and only if there is a $p \in F$ and an $m < t^{\mathcal{M}_\omega}$ such that $p \Vdash \psi(G, m)$, which
means exactly that there is a $p \in F$ that forces $(\exists x < t)\psi(G, x) = \varphi(G)$. The case of
$\varphi(X) = (\exists x)\psi(X, x)$ is handled similarly. This completes the proof. □


Being able to write down the properties we want to decide about our generic 
object in the forcing language is convenient. In many cases, it eliminates the need to 
think too much about the particular forcing notion we are working with. We simply
take a sufficiently generic object and appeal to Theorem 7.4.6. But in other situations, 
particularly when we care about effectivity of the generic object, we wish to find an 
n-generic for as small an n as possible, and in these cases we must be more careful. 


Example 7.4.7. Consider Cohen forcing, and take the sentence

$(\forall e)(\exists x)[\Phi_e(x)\uparrow \vee\ \Phi_e(x)\downarrow \neq G(x)]$.


This is $\Pi^0_3$, and so as remarked above, every 3-generic filter contains a condition that
decides it. But we know the negation of this sentence cannot be forced: for all $e \in \omega$
and all conditions $\sigma$, if $x \geq |\sigma|$ and $\Phi_e(x)\downarrow$ there is a $\tau \succeq \sigma$ with $\Phi_e(x) \neq \tau(x)$.
Hence, by Theorem 7.4.6, every 3-generic set is noncomputable. One way to improve
this is to instead take, for each $e \in \omega$, the sentence


$(\exists x)[\Phi_e(x)\uparrow \vee\ \Phi_e(x)\downarrow \neq G(x)]$.


This is $\Sigma^0_2$, hence every 2-generic filter contains a condition that decides it, and
indeed, by the same argument as above, forces it. We thus conclude that already
every 2-generic set is noncomputable. But now consider the sentence


$(\exists x)[\Phi_e(x)\downarrow \neq G(x)]$.


This is $\Sigma^0_1$, hence every 1-generic filter contains a condition that decides it. It is no
longer the case that the sentence is necessarily forced, but if its negation is forced
then our argument actually shows that so is $(\exists x)[\Phi_e(x)\uparrow]$. Hence, in any case, every
1-generic set $G$ still satisfies that $G \neq \Phi_e$ for all $e$, and so is noncomputable.
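
The extension step used throughout Example 7.4.7 (given $\sigma$ and a convergent $\Phi_e(x)$ with $x \geq |\sigma|$, find $\tau \succeq \sigma$ disagreeing with $\Phi_e$ at $x$) is easy to sketch. The Python fragment below is an illustration under a simplifying assumption: the functions being diagonalized against are total and $\{0,1\}$-valued, so the divergence case never arises. The names `diagonalize` and `noncomputable_prefix` are hypothetical.

```python
def diagonalize(sigma, phi):
    """Return an extension tau of sigma with tau(x) != phi(x) at x = len(sigma)."""
    x = len(sigma)
    return sigma + (1 - phi(x),)


def noncomputable_prefix(functions):
    """Extend the empty condition so that it disagrees with each listed function."""
    sigma = ()
    for phi in functions:
        sigma = diagonalize(sigma, phi)
    return sigma


# Three total {0,1}-valued functions standing in for Phi_0, Phi_1, Phi_2.
functions = [lambda x: 0, lambda x: 1, lambda x: x % 2]
sigma = noncomputable_prefix(functions)
print(sigma)
```

Any set extending the resulting condition differs from the `e`-th listed function at input `e`. With genuine partial computable functions one cannot decide convergence, which is exactly why the example tracks how much genericity is needed.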


Example 7.4.8. Consider Mathias forcing (within $\omega$), and for each $e \in \omega$, take the
sentence

$(\exists m)(\forall x > m)[\Phi_e(x)\uparrow \vee\ p_G(x) \geq \Phi_e(x)]$.


This is $\Sigma^0_2$, and so by the argument in Example 7.3.6, every 2-generic filter must
contain a condition that forces it. Hence, every Mathias 2-generic set is high. But
here, this cannot in general be improved to 1-generics (see Exercise 7.8.11).


7.5 Effective forcing 


In computability theory, forcing constructions are particularly useful when we can 
gauge their effectiveness. In broad terms, this allows us to conclude that not only can 
sufficiently generic objects always be found, but that they can be found Turing below 
certain oracles. To this end, we first need to discuss effective notions of forcing. 


Remark 7.5.1. In this section, we assume all forcing notions to be countable, and 
represented by subsets of w. There can of course be different representations in this 
sense. For example, in Mathias forcing with computable reservoirs we could identify 
a condition $(E, I)$ with the pair $(e, i)$, where $e$ is a canonical index for $E$, and $i$ is
either a $\Delta^0_1$ index for $I$ or a $\Sigma^0_1$ index for $I$ (i.e., $I = \Phi_i$ or $I = W_i$). These kinds of
distinctions will typically not affect anything in our discussion, but when they do, we 
will be explicit about them. Otherwise, we will usually refer simply to conditions, 
extensions, etc., when of course we really mean the underlying sets of codes, as per 
Convention 3.4.5. 


Definition 7.5.2. Fix $A \in 2^\omega$. A notion of forcing $\mathbb{P} = (P, \leq, V)$ is A-computable if
$P$ is (represented by) an $A$-computable subset of $\omega$, $\leq$ is an $A$-computable relation
on $P$, and $V$ is an $A$-computable map $P \to 2^{<\omega}$.


Thus all forcings here are computable in the join of P, <, and V. Obviously, Cohen 
forcing is computable. We give two other examples. 


Example 7.5.3. Let $X$ be an infinite computable set, and consider Mathias forcing
within $X$ with computable reservoirs (Example 7.2.10). The set of conditions is then
$\Pi^0_2$. The extension relation is $\Pi^0_1$ relative to pairs of conditions, and the valuation
map is computable in the conditions. It follows that this forcing is $\emptyset''$-computable.


Example 7.5.4. Let $T \subseteq 2^{<\omega}$ be a computable infinite tree, and consider Jockusch-Soare
forcing with computable subtrees of $T$ (Example 7.2.11). The set of conditions here is
$\Pi^0_1$, as is the extension relation. The valuation map is computable in the conditions.
This forcing is therefore $\emptyset'$-computable.


We now move on to effective forms of density. 


Definition 7.5.5 (Effective density). Fix $A \in 2^\omega$ and let $\mathbb{P} = (P, \leq, V)$ be a notion
of forcing.


1. A subset $D$ of $P$ is A-effectively dense if there is a partial $A$-computable function
$f : \omega \to \omega$ such that $f(p)\downarrow$ for every $p \in P$ and $f(p) \leq p$ and $f(p) \in D$.

2. A collection $\{D_i : i \in \omega\}$ of subsets of $P$ is uniformly A-effectively dense (or,
for short, $D_i$ is uniformly A-effectively dense for each $i$) if there is a partial
$A$-computable function $f : \omega \times \omega \to \omega$ such that $f(p, i)\downarrow$ for every $p \in P$ and
$i \in \omega$, and $f(p, i) \leq p$ and $f(p, i) \in D_i$.


If P is A-computable then for each i the set {p € P : |V(p)| > i} is A-effectively 
dense, and in fact, the collection of all these sets, across all i, is uniformly A- 
effectively dense. We can use this to get the following effective version of the 
Rasiowa-Sikorski theorem. 


Theorem 7.5.6 (Existence of effective generics). Fix $A \in 2^\omega$, let $\mathbb{P} = (P, \leq, V)$ be
a notion of forcing, $p$ a condition, and $\mathcal{D} = \{D_i : i \in \omega\}$ a uniformly $A$-effectively
dense collection of subsets of $P$. There exists an $A$-computable $\mathcal{D}$-generic sequence
$p_0 \geq p_1 \geq \cdots$ such that $p_0 = p$ and $p_{i+1}$ meets $D_i$.


Proof. Fix a partial $A$-computable function $f$ witnessing that $\mathcal{D}$ is uniformly
$A$-effectively dense. Let $p_0 = p$, and given $p_i$ for some $i \in \omega$, let $p_{i+1} = f(p_i, i)$. □
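
The recursion in this proof is short enough to write out. The following Python sketch is an illustration for Cohen conditions (tuples of bits) only; the function `witness` is a hypothetical density witness for a made-up family, namely $D_{2j}$ = conditions with at least $j+1$ zeros and $D_{2j+1}$ = conditions with at least $j+1$ ones. Neither the family nor the names come from the text.

```python
def generic_sequence(p, f, steps):
    """Return [p_0, ..., p_steps] with p_0 = p and p_{i+1} = f(p_i, i)."""
    seq = [p]
    for i in range(steps):
        q = f(seq[-1], i)
        if q[:len(seq[-1])] != seq[-1]:
            raise ValueError("f must return an extension of its input")
        seq.append(q)
    return seq


def witness(p, i):
    """Extend p into D_i: at least j+1 zeros if i = 2j, at least j+1 ones if i = 2j+1."""
    j, parity = divmod(i, 2)
    bit = parity                          # even i asks for zeros, odd i for ones
    missing = max(0, j + 1 - p.count(bit))
    return p + (bit,) * missing


seq = generic_sequence((), witness, 6)
print(seq[-1])
```

Interleaving the two families in a single index `i` is the same device used below to combine several uniformly effectively dense collections; here it yields longer and longer approximations to a set that is both infinite and coinfinite.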
If we have two or more collections of subsets of P, each uniformly X-effectively
dense, then we can interweave these to produce another such collection. In this 
way, Theorem 7.5.6 can be quite a powerful tool. Let us illustrate this for each of 
Jockusch-Soare forcing, Cohen forcing, and Mathias forcing in turn. We begin by 
revisiting the low basis theorem (Theorem 2.8.18). 


Example 7.5.7. Let $T \subseteq 2^{<\omega}$ be an infinite computable tree and consider Jockusch-Soare
forcing with computable subtrees of $T$. Notice that the set $L_i$ of all conditions
$(U, n)$ such that $n \geq i$ is uniformly $\emptyset'$-effectively dense for each $i$. Now, for each $e$
let $J_e$ be the set of conditions $(U, n)$ such that either $\Phi_e^{\sigma}(e)\uparrow$ for all $\sigma \in U$, or
else $\Phi_e^{\sigma}(e)\downarrow$ for the unique $\sigma \in U$ of length $n$. Then $J_e$ is uniformly $\emptyset'$-effectively
dense for each $e$, which is the iterative step in the proof of the low basis theorem that
we saw earlier.

Let $\mathcal{D} = \{D_i : i \in \omega\}$, where $D_{2i} = L_i$ and $D_{2i+1} = J_i$ for all $i$. By Theorem 7.5.6,
we can find a $\emptyset'$-computable sequence of conditions $(U_0, n_0) \geq (U_1, n_1) \geq \cdots$ such
that $(U_{2e+1}, n_{2e+1}) \in J_e$ for all $e$, and such that $f = \bigcup_{e \in \omega} V((U_e, n_e)) \in 2^\omega$. As
we know, $f \in [T]$. We claim that $f$ is low. Indeed, we have that $e \in f'$ if and only
if $\Phi_e^{\sigma}(e)\downarrow$ for the unique $\sigma \in U_{2e+1}$ of length $n_{2e+1}$. Since the generic sequence is
$\emptyset'$-computable and $e$ is arbitrary, $f' \leq_T \emptyset'$.


For another application, recall that a set X has hyperimmune free degree if every 
X-computable function is dominated by a computable function (Definition 2.8.19 
and Theorem 2.8.21). 


Proposition 7.5.8. Let $T \subseteq 2^{<\omega}$ be an infinite computable tree, and consider
Jockusch-Soare forcing with computable subtrees of $T$. For each $e$, let $D_e$ be the set
of conditions that decide the sentence $(\forall x)[\Phi_e^{G}(x)\downarrow]$ and let $\mathcal{D} = \{D_e : e \in \omega\}$.


1. $\mathcal{D}$ is uniformly $\emptyset''$-effectively dense.
2. If $f \in 2^\omega$ is $\mathcal{D}$-generic then $f$ has hyperimmune free degree.


Proof. We leave part (1) to Exercise 7.8.2. For part (2), fix $e$ and a condition $(U, m)$
having $f$ as a path through it that decides the $\Pi^0_2$ sentence $(\forall x)[\Phi_e^{G}(x)\downarrow]$. If $(U, m)$
forces the negation, then $\Phi_e^{f}$ cannot be total by Theorem 7.4.6. So suppose instead
that $(U, m) \Vdash (\forall x)[\Phi_e^{G}(x)\downarrow]$. Then for each $x$ the tree $U_x = \{\sigma \in U : \Phi_e^{\sigma}(x)\uparrow\}$ is
finite, else $(U_x, m)$ would be an extension of $(U, m)$ forcing $\Phi_e^{G}(x)\uparrow$, contradicting
monotonicity of the forcing relation (Proposition 7.4.2).

Now define a function $g$ as follows: for each $x$, find an $\ell$ so that $\Phi_e^{\sigma}(x)\downarrow$ for
every $\sigma \in U$ of length $\ell$, and let $g(x) = \max\{\Phi_e^{\sigma}(x) : \sigma \in U \wedge |\sigma| = \ell\}$. Since $U$ is
computable, so is $g$. Since $f$
is a path through $U$, it follows that $g(x) \geq \Phi_e^{f}(x)$ for all $x$. Since $e$ was arbitrary,
it follows that every $f$-computable function is dominated by a computable function,
as desired. □
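
The definition of $g$ in this proof is an explicit search, which the following Python sketch imitates on a toy scale. It is an illustration under stated assumptions: the tree is given as a predicate on bit tuples, the functional `phi(sigma, x)` returns `None` when the oracle string `sigma` is too short for the computation to converge, and the search is cut off at `max_level`. The names `level`, `dominating_value`, `no_adjacent_ones`, and `count_ones` are hypothetical.

```python
from itertools import product


def level(tree, n):
    """All strings of length n in the tree (given as a predicate on bit tuples)."""
    return [s for s in product((0, 1), repeat=n) if tree(s)]


def dominating_value(tree, phi, x, max_level=12):
    """g(x): max of phi(sigma, x) over the first level on which phi(., x) is defined everywhere."""
    for n in range(max_level + 1):
        values = [phi(s, x) for s in level(tree, n)]
        if values and all(v is not None for v in values):
            return max(values)
    raise ValueError("no suitable level up to max_level")


def no_adjacent_ones(s):
    """Toy tree: binary strings with no two consecutive ones."""
    return all(not (a and b) for a, b in zip(s, s[1:]))


def count_ones(s, x):
    """Toy functional: reads the first 2x+1 oracle bits, undefined on shorter strings."""
    return sum(s[:2 * x + 1]) if len(s) >= 2 * x + 1 else None


print([dominating_value(no_adjacent_ones, count_ones, x) for x in range(3)])
```

For any path through the toy tree, the value of `count_ones` along that path at `x` is at most `dominating_value(..., x)`, mirroring the inequality $g(x) \geq \Phi_e^{f}(x)$. The brute-force enumeration of levels is exponential and is only meant to make the search visible.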


We obtain the following well-known refinement of the hyperimmune free basis 
theorem (Theorem 2.8.22). 
Corollary 7.5.9 (Jockusch and Soare [173]). Every infinite computable tree $T \subseteq
2^{<\omega}$ has a low$_2$ path of hyperimmune free degree.


Proof. By Theorem 7.5.6 and Proposition 7.5.8, using the fact that for all sets $A$ and
$B$, $A'' \leq_T B$ if and only if $\{e \in \omega : (\forall x)[\Phi_e^{A}(x)\downarrow]\} \leq_T B$. □


Next, let us turn to Cohen forcing, which as usual exhibits some of the nicest 
behavior. In particular, here we can very simply calibrate the complexity of the 
forcing relation itself. The proof of the following is left to the reader (Exercise 7.8.7). 


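Lemma 7.5.10 (Jockusch [169]). Fix $n \geq 1$ along with a $\Sigma^0_n$ sentence $\varphi$ of $\mathcal{L}_2(G)$
and a $\Pi^0_n$ sentence $\psi$ of $\mathcal{L}_2(G)$. Let $\sigma$ be a Cohen condition.


1. The relation $\sigma \Vdash \varphi$ is $\Sigma^0_n$ definable, uniformly in an index for $\varphi$ and $\sigma$.
2. The relation $\sigma \Vdash \psi$ is $\Pi^0_n$ definable, uniformly in an index for $\psi$ and $\sigma$.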


This has the following consequence, showing that Cohen forcing enjoys an alternate 
characterization of n-genericity that can sometimes be easier to work with. In fact, 
this is often the way the notion is defined for this forcing. 


Proposition 7.5.11 (Jockusch [169]). A filter $F$ is n-generic for Cohen forcing if
and only if $F$ meets or avoids every set of Cohen conditions definable by a $\Sigma^0_n$ formula
of $\mathcal{L}_2$ (with no parameters).


Proof. For the “only if” direction, fix a $\Sigma^0_n$ formula $\varphi(x)$ of $\mathcal{L}_2$ and suppose this
defines a set $S$ of Cohen conditions. Let $\psi(X)$ be the formula $(\exists \sigma)[\varphi(\sigma) \wedge \sigma \prec X]$.
Then $\psi(G)$ is a $\Sigma^0_n$ formula of $\mathcal{L}_2(G)$. Consider any $\tau \in F$ that decides $\psi(G)$. If
$\tau \Vdash \psi(G)$, then by Theorem 7.4.6, $\psi(G)$ holds for the real $G$ determined by $F$. This
means some $\sigma \prec G$ belongs to $S$, and this $\sigma$ must belong to $F$ (as remarked in
Remark 7.2.9). Thus, $F$ meets $S$. Suppose next that $\tau \Vdash \neg\psi(G)$. Then no extension
of $\tau$ can belong to $S$, because any such extension would have a further extension
forcing $\psi(G)$ (and therefore would force both $\psi(G)$ and $\neg\psi(G)$, which cannot be).
Indeed, suppose $\tau^* \succeq \tau$ belongs to $S$, consider any n-generic filter $F^*$ containing $\tau^*$,
and let $G^*$ be the object determined by $F^*$. Then $\psi(G^*)$ holds, as witnessed by $\tau^*$,
so some $\rho \succeq \tau^*$ in $F^*$ must force $\psi(G)$. We conclude that $\tau$ witnesses that $F$ avoids
$S$.

For the “if” direction, fix any $\Sigma^0_n$ formula $\varphi$ of $\mathcal{L}_2(G)$. By Lemma 7.5.10, the set
$\{\sigma \in 2^{<\omega} : \sigma \Vdash \varphi\}$ is $\Sigma^0_n$ with the same parameters. If $F$ meets this set then it
contains a condition forcing $\varphi$, and if $F$ avoids it then it contains a condition that
forces $\neg\varphi$. □


A less technical application of Lemma 7.5.10 is the following, yielding even more 
effective bounds on Cohen generics than Theorem 7.5.6. 


Proposition 7.5.12 (Jockusch [169]). Fix $n \geq 1$. In Cohen forcing, we have the
following.


1. If $G \in 2^\omega$ is n-generic then $G^{(n)} \leq_T G \oplus \emptyset^{(n)}$.
2. There exists a low$_n$ n-generic set.


Proof. Let $\varphi_0, \varphi_1, \ldots$ be a computable listing of all $\Sigma^0_n$ sentences of $\mathcal{L}_2(G)$ such that
$\varphi_{2e}$ is the sentence $e \in G^{(n)}$ for every $e$. If $G$ is n-generic then for each $i \in \omega$ there
must be a $\sigma \prec G$ that decides $\varphi_i$. By Lemma 7.5.10, $\emptyset^{(n)}$ can determine whether
a given condition forces this sentence, and of course, $G$ can determine whether
a given Cohen condition is an initial segment of it. Hence, $G \oplus \emptyset^{(n)}$ can find $\sigma$
and determine whether it forces $\varphi_i$ or not. By Theorem 7.4.6, for each $e$ we have
$e \in G^{(n)}$ if and only if some $\sigma \prec G$ forces $\varphi_{2e}$. Thus, $G^{(n)} \leq_T G \oplus \emptyset^{(n)}$, and this
gives us part (1). For part (2), define $D_i = \{\sigma \in 2^{<\omega} : \sigma \text{ decides } \varphi_i\}$ for each $i$. Then
$\mathcal{D} = \{D_i : i \in \omega\}$ is uniformly $\emptyset^{(n)}$-effectively dense by Lemma 7.5.10. Hence, by
Theorem 7.5.6 there is a $\mathcal{D}$-generic $G \leq_T \emptyset^{(n)}$, which is of course n-generic. By
part (1), we have $G^{(n)} \leq_T G \oplus \emptyset^{(n)} \leq_T \emptyset^{(n)}$, as wanted. □


For comparison, as we have already seen in Example 7.3.6, there is no analogous
result to part (2) for Mathias forcing. If $n \geq 2$, then in basically every flavor of
Mathias forcing (with computable reservoirs, with cone avoiding reservoirs, etc.),
every n-generic set is high (and so not low$_n$).

We stay with Mathias forcing now, and conclude this section with a series of
results that will be very useful to us in Chapter 8.


Lemma 7.5.13. Let $\vec{R} = \langle R_i : i \in \omega \rangle$ be an instance of COH, and consider Mathias
forcing with $\vec{R}$-computable reservoirs. For each $i \in \omega$, let $M_i$ be the set of all
conditions $(E, I)$ with either $I \subseteq R_i$ or $I \subseteq \overline{R_i}$. Then for every $X \gg \vec{R}'$, the
collection $\{M_i : i \in \omega\}$ is uniformly $X$-effectively dense.


Proof. Fix a condition $(E, I)$ and $i \in \omega$. Each of the statements “$I \cap R_i$ is infinite”
and “$I \cap \overline{R_i}$ is infinite” is $\Pi^0_2(\vec{R})$, hence $\Pi^0_1(\vec{R}')$. Moreover, at least one of the two is
true. By Theorem 2.8.25 (specifically, part (4), as explained there) $X$ can uniformly
select one of the two which is true. Define $I^* = I \cap R_i$ if $X$ selects the first statement,
and $I^* = I \cap \overline{R_i}$ if it selects the second. Let $E^* = E$. Then $(E^*, I^*)$ is an extension of
$(E, I)$ in $M_i$. □


Lemma 7.5.14. Fix $A \in 2^\omega$, and consider Mathias forcing with $A$-computable
reservoirs.


1. For each $i$, let $L_i$ be the set of conditions $(E, I)$ with $|E| \geq i$. Then $\{L_i : i \in \omega\}$
is uniformly effectively dense.

2. For each $e \in \omega$, let $J_e$ be the set of conditions that decide the sentence $e \in G'$.
Then $\{J_e : e \in \omega\}$ is uniformly $A'$-effectively dense.


Proof. For (1), given $i$ and $(E, I)$ we can pass to $(E \cup \{\min I\}, I \setminus \{\min I\})$, repeating
this step until we reach a condition in $L_i$. An
index for this extension is uniformly computable from an index for $(E, I)$. For (2), fix
a condition $(E, I)$ and $e \in \omega$. Now $A'$ can determine, uniformly in $e$, whether there
is a finite $F \subseteq I$ such that $\Phi_e^{E \cup F}(e)\downarrow$ (with use bounded by $\max F$). If so, let $E^* = E \cup F$
and $I^* = \{x \in I : x > F\}$. Otherwise, let $E^* = E$ and $I^* = I$. In either case, $(E^*, I^*)$
is an extension of $(E, I)$ in $J_e$. □
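
Part (2) of this proof is a search for a finite $F \subseteq I$ followed by a case split, which can be mimicked as follows. This Python sketch is a bounded toy, not the actual argument: the reservoir is replaced by a finite list, and `halts` is a hypothetical stand-in predicate for "$\Phi_e^{E \cup F}(e)\downarrow$". In the lemma itself the reservoir is infinite and it is the oracle $A'$ that answers the existential question, so a negative answer from this toy only reflects the finite search.

```python
from itertools import combinations


def decide_jump(E, reservoir, halts):
    """Return (E*, I*, answer), where `reservoir` is a finite stand-in for I."""
    for size in range(len(reservoir) + 1):
        for F in combinations(reservoir, size):
            if halts(set(E) | set(F)):
                top = max(F, default=-1)
                # E* = E u F, and I* keeps only the elements above F.
                return sorted(set(E) | set(F)), [x for x in reservoir if x > top], True
    # No finite F found: keep the condition unchanged.
    return list(E), list(reservoir), False


# Hypothetical computation that converges once 7 and 9 are both in the oracle.
result = decide_jump([1, 3], [5, 7, 9, 11], lambda s: 7 in s and 9 in s)
print(result)
```

In the positive case the new finite part already contains everything the computation used, so every set compatible with the new condition gives the same answer about $e \in G'$.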


Theorem 7.5.15 (Cholak, Jockusch, and Slaman [33]). Let $\vec{R} = \langle R_i : i \in \omega \rangle$ be
an instance of COH. For every $X \gg \vec{R}'$, there is a solution $G$ to $\vec{R}$ satisfying $G' \leq_T X$.


Proof. For each $i$, define $D_{3i} = M_i$, $D_{3i+1} = L_i$, and $D_{3i+2} = J_i$, where $M_i$, $L_i$, and $J_i$
are as defined in the lemmas. Let $\mathcal{D} = \{D_i : i \in \omega\}$. As shown above, $\mathcal{D}$ is uniformly
$X$-effectively dense. By Theorem 7.5.6, there is an $X$-computable generic sequence
$(E_0, I_0) \geq (E_1, I_1) \geq \cdots$ with $(E_{3e+2}, I_{3e+2}) \in J_e$ for all $e$. Note that meeting all
the $L_i$ implies that the valuations of these conditions have unbounded length, so the
sequence determines an object, $G$. As in Example 7.3.9, $G$ is $\vec{R}$-cohesive. And for
each $e$ we have $e \in G'$ if and only if $(E_{3e+2}, I_{3e+2}) \Vdash e \in G'$, which is to say, if and
only if $\Phi_e^{E_{3e+2}}(e)\downarrow$. Hence, $G' \leq_T X$, as desired. □


Corollary 7.5.16 (Cholak, Jockusch, and Slaman [33]). Every COH-instance $\vec{R} =
\langle R_i : i \in \omega \rangle$ has a solution that is low$_2$ relative to $\vec{R}$.


Proof. Relativizing Proposition 2.8.14 to $\vec{R}'$, we get an infinite $\vec{R}'$-computable tree
$T$ such that if $f$ is a path through $T$ then $\vec{R}' \oplus f \gg \vec{R}'$ (in fact, $f \gg \vec{R}'$). By the low
basis theorem, relativized to $\vec{R}'$, there is some such $f$ satisfying $(\vec{R}' \oplus f)' \leq_T \vec{R}''$.
Now by Theorem 7.5.15 with $X = \vec{R}' \oplus f$, there is a COH-solution $G$ to $\vec{R}$ such that
$G' \leq_T \vec{R}' \oplus f$. Hence, we have $G'' \leq_T (\vec{R}' \oplus f)' \leq_T \vec{R}''$, as was to be shown. □


7.6 Forcing in models 


The core use of forcing in reverse mathematics is to establish bounds on the instances
and solutions of various problems, often for the purposes of building models witnessing
separations between mathematical theorems. The framework described up
to this point is designed for the full standard model, but we can modify it to
work over models in general.


Definition 7.6.1. Let $\mathcal{M}$ be an $\mathcal{L}_2$ structure. A notion of forcing in $\mathcal{M}$ is a triple
$\mathbb{P} = (P, \leq, V)$ where $(P, \leq)$ is a partial order and $V$ is a map $P \to 2^{<M}$ satisfying
the following:


1. If $p, q \in P$ with $p \leq q$ then $V(q) \preceq^{\mathcal{M}} V(p)$.
2. For each $n \in M$ and $q \in P$ there is a $p \in P$ such that $p \leq q$ and $n \leq^{\mathcal{M}} |V(p)|$.


This definition therefore agrees with our earlier one (Definition 7.2.1) for the cases 
where M is an w-model. 

One very useful example of the more general definition is when we wish to work 
over a Scott set. We will employ the following forcing in Chapters 8 and 9. 


Example 7.6.2. Let $\mathcal{S}$ be a Scott set. (Recall that this means $\mathcal{S}$ is an $\omega$-model such
that every infinite binary tree $T \in \mathcal{S}$ has an infinite path $f \in \mathcal{S}$.) For $X \in \mathcal{S}$, we can
define Mathias forcing within $X$ with reservoirs in $\mathcal{S}$ to be the restriction to Mathias
conditions $(E, I)$ such that $I \subseteq X$ and $I \in \mathcal{S}$.


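We can formulate notions of density and (generic) filters as before. Given a filter
$F$ such that for every $n$ there is a $p \in F$ with $|V(p)| \geq^{\mathcal{M}} n$ we can also define
$G = \bigcup_{p \in F} V(p)$, the (generic) object determined by $F$. Notice that


$G(i) = b \leftrightarrow (\exists \sigma \in 2^{<M})(\exists p \in F)[V(p) = \sigma \wedge i <^{\mathcal{M}} |\sigma| \wedge \sigma(i) = b]$
$\phantom{G(i) = b} \leftrightarrow (\forall \sigma \in 2^{<M})(\forall p \in F)[V(p) = \sigma \wedge i <^{\mathcal{M}} |\sigma| \to \sigma(i) = b]$.


Hence, if $P$ and $V$ are subsets of $M$ then $G$ is $\Delta^0_1$ definable in $\mathcal{M}$ from $F$, $P$, and $V$.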

With appropriate modifications, most of the principal results from the previous 
section lift to the current setting, chief among them the following version of the 
Rasiowa-Sikorski theorem (Theorem 7.3.3). 


Theorem 7.6.3. Let $\mathcal{M}$ be a countable $\mathcal{L}_2$ structure, $\mathbb{P} = (P, \leq, V)$ a notion of
forcing in $\mathcal{M}$, $p$ a condition, and $\mathcal{D} = \{D_i : i \in M\}$ a collection of dense subsets of
$P$. There exists a $\mathcal{D}$-generic filter containing $p$.


As before, the filter above can always be chosen to determine an object. Thus, we freely
follow Convention 7.3.4, appropriately re-interpreted.

We can also adapt the definition of the forcing relation to this more general setting, 
although we need to take some more care here. 


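Definition 7.6.4 (Forcing in a model). Let $\mathcal{M}$ be an $\mathcal{L}_2$ structure and $\mathbb{P} = (P, \leq, V)$
a notion of forcing in $\mathcal{M}$. Let $p$ be any condition, and $\varphi$ an arithmetical sentence of
$\mathcal{L}_2(G)$. We define the relation of $p$ forcing $\varphi$ in $\mathcal{M}$, written $p \Vdash^{\mathcal{M}} \varphi$, as follows.


1. If $\varphi$ is an atomic sentence of $\mathcal{L}_2$, then $p \Vdash^{\mathcal{M}} \varphi$ if $\mathcal{M} \models \varphi$.
2. If $\varphi$ is $t \in G$ for a term $t$ then $p \Vdash^{\mathcal{M}} \varphi$ if $t^{\mathcal{M}} <^{\mathcal{M}} |V(p)|$ and $V(p)(t^{\mathcal{M}}) = 1$.
3. If $\varphi$ is $(\exists x < t)\psi(x)$ for a term $t$ then $p \Vdash^{\mathcal{M}} \varphi$ if there is an $a \in M$ such that
$a <^{\mathcal{M}} t^{\mathcal{M}}$ and $p \Vdash^{\mathcal{M}} \psi(a)$.
4. If $\varphi$ is $\psi_0 \vee \psi_1$, then $p \Vdash^{\mathcal{M}} \varphi$ if $p \Vdash^{\mathcal{M}} \psi_0$ or $p \Vdash^{\mathcal{M}} \psi_1$.
5. If $\varphi$ is $\neg\psi$, then $p \Vdash^{\mathcal{M}} \varphi$ if $q \nVdash^{\mathcal{M}} \psi$ for all $q \leq p$.
6. If $\varphi$ is $(\exists x)\psi(x)$, then $p \Vdash^{\mathcal{M}} \varphi$ if there is an $a \in M$ such that $p \Vdash^{\mathcal{M}} \psi(a)$.

We say $p$ decides $\varphi$ in $\mathcal{M}$ if $p \Vdash^{\mathcal{M}} \varphi$ or $p \Vdash^{\mathcal{M}} \neg\varphi$.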


If $\mathcal{M}$ is an $\omega$-model then the above agrees with Definition 7.4.1. This is because
every $\omega$-model is an $\omega$-submodel of $\mathcal{M}_\omega$, so by Theorem 5.9.3, $\omega$-models satisfy
the same arithmetical formulas. (We say arithmetical formulas are absolute across
$\omega$-models.) Thus if $\mathcal{M}$ is any $\omega$-model then $\Vdash^{\mathcal{M}}$ is just $\Vdash$.

With Definition 7.6.4 in hand, we can define n-genericity (Definition 7.4.3) much 
as before. 


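Definition 7.6.5 (n-generic filters). Let $\mathcal{M}$ be a countable $\mathcal{L}_2$ structure,
$\mathbb{P} = (P, \leq, V)$ a notion of forcing in $\mathcal{M}$, and $n \in \omega$. A filter $F$ is n-generic in $\mathcal{M}$ if every $\Sigma^0_n$
sentence $\varphi$ of $\mathcal{L}_2(G)$ is decided in $\mathcal{M}$ by some $p \in F$.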


The principal results from the previous sections now carry over mutatis mutandis, 
most notably the algebraic properties of the forcing relation (Proposition 7.4.2) and 
the connection between forcing and truth (Theorem 7.4.6). 


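Theorem 7.6.6. Let $\mathcal{M}$ be a countable $\mathcal{L}_2$ structure, $\mathbb{P} = (P, \leq, V)$ a notion of
forcing in $\mathcal{M}$, $n \in \omega$, $F$ an n-generic filter determining an object $G \in 2^{M}$, and $\varphi(X)$
a $\Sigma^0_n$ or $\Pi^0_n$ formula of $\mathcal{L}_2$. Then $\mathcal{M}[G] \models \varphi(G)$ if and only if some $p \in F$ forces
$\varphi(G)$ in $\mathcal{M}$.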


Here, M[G] is as in Definition 5.9.5. 
7.7 Harrington's theorem and conservation


An important application of the generalization of forcing to general models is to 
conservation results. We introduced the basic framework for this in the previous 
chapter, with Corollaries 5.10.2 and 5.10.7. Each of these earlier results is obtained 
by definitional extensions. Given a model, we expand its second order part by adding 
suitably definable sets. As a consequence, formulas involving these new sets can be 
immediately transformed into formulas not involving them, whence properties like 
comprehension and induction transfer from the original model to the expanded one. 
Now, we will instead add sets by forcing. The tension then is that the more generic 
a set is, the less definable it is over the original model. 

The point of this section is to prove Harrington’s theorem, which says that WKL_0
has the same first order part as RCA_0. First, though, let us look at the analogous
result for COH, which is simpler and can serve as a warm-up. We will see this result 
generalized in Theorem 8.7.23. 


Theorem 7.7.1 (Cholak, Jockusch, and Slaman [33]). RCA_0 + COH is Π^1_1
conservative over RCA_0.


Proof. Fix an L_2 structure M and R ∈ S^M such that

M ⊨ RCA_0 + “R is an instance of COH”.

By Corollary 5.9.6, it suffices to produce a set G such that M[G] ⊨ IΣ^0_1 and

M[G] ⊨ “G is an infinite R-cohesive set”.


We use Mathias forcing in M. Conditions will thus be pairs (E, I) such that E, I ∈
S^M, E is M-finite, I is M-infinite, and M ⊨ E < I. For any sufficiently generic
G for this forcing we will certainly have, in M[G], that G is an infinite R-cohesive
set, by the same argument as in Example 7.3.9. So, we have only to verify that
M[G] ⊨ IΣ^0_1. Seeking a contradiction, say φ(X, x) is a Σ^0_1 formula of L_2 such that

M[G] ⊨ φ(G, 0) ∧ (∀x)[φ(G, x) → φ(G, x + 1)] ∧ ¬φ(G, a)

for some a ∈ M. By Theorem 7.6.6, there is a Mathias condition (E, I) such that
M[G] ⊨ E ⊆ G ⊆ E ∪ I and such that (E, I) forces

φ(G, 0) ∧ (∀x)[φ(G, x) → φ(G, x + 1)] ∧ ¬φ(G, a)    (7.1)


in M. Fix a Σ^0_0 formula ψ(X, x, y) such that φ(X, x) is equivalent in M to
(∃y)ψ(X, x, y). Let S be the set of all b ≤^M a such that there is a y and a finite
set F ⊆ I with M ⊨ ψ(E ∪ F, b, y). Thus, S is Σ^0_1-definable in M, and so
belongs to S^M by bounded Σ^0_1 comprehension. Now, by definition of the valuation
map for Mathias forcing, we see that for all b ≤^M a, if b ∈ S then (E, I) has an
extension that forces φ(G, b), and if b ∉ S then (E, I) ⊩ ¬φ(G, b). Since (E, I)
forces (7.1), we have 0 ∈ S and a ∉ S. Hence, by IΣ^0_1 in M, we can fix a b ≤^M a such that b − 1 ∈ S and
b ∉ S. It follows that there is an extension of (E, I) that forces φ(G, b − 1) ∧ ¬φ(G, b).
But this extension must still force (7.1) in M, so we have a contradiction. □


Remark 7.7.2. A crucial point in the above proof, which is common across forcing 
constructions, is that the complexity of the formula we are interested in forcing 
matches the complexity of forcing that formula. Sometimes, we get this automatically 
from the definition of forcing, as above. But if the formula is more complicated, or 
the forcing partial order is more complicated, this need not be the case. There are 
different strategies for dealing with such situations. We may be able to overcome the 
obstacle simply by organizing the proof in a careful way (as in Theorem 7.7.3, which 
we are about to look at). In other cases, we may need instead to force a more general 
formula (as in Lemma 8.5.7 below), or to restrict our set of conditions somehow (as 
in Propositions 8.7.4 and 8.7.5). 


We now turn to the main result of this section, which is the following seminal 
result of Harrington. 


Theorem 7.7.3 (Harrington). WKL_0 is Π^1_1 conservative over RCA_0.


Again, we give a model extension argument. Given an instance T of WKL in a model
M of RCA_0, how do we pick a path G through T so that M[G] satisfies Σ^0_1 induction?
(It is not difficult to see that we cannot pick a path arbitrarily. See Exercise 7.8.8.) 
The rough idea is to obtain G by the low basis theorem, but formalizing this is quite 
delicate. 

We will use Jockusch–Soare forcing inside a model M of RCA_0. Our conditions
will thus be pairs (U, m) such that M satisfies that U is an infinite tree with a unique
string of length m ∈ M, with (U*, m*) ≤ (U, m) if m ≤^M m* and M satisfies that
U* ⊆ U, and with the valuation of (U, m) being its unique element of length m. We
need the following technical lemma. 


Lemma 7.7.4. Fix a countable L_2 structure M and a T ∈ S^M such that

M ⊨ RCA_0 + “T is an infinite binary tree”.

Let φ(z) be a Σ^0_0 formula of L_2(G). There exists a Σ^0_0 formula φ^∃(x, z) such that for
every tuple b of elements of M and every Jockusch–Soare condition (U, m) in M
the following hold.

1. If M ⊨ φ^∃(σ, b) for some σ ∈ 2^{<M} then M ⊨ φ^∃(τ, b) for every τ ⪰^M σ.
2. If M ⊨ φ^∃(σ, b) for the unique σ ∈ U of length m then (U, m) ⊩^M φ(b).
3. If (U, m) ⊩^M φ(b) then there is some (U*, m*) ≤ (U, m) in M such that
M ⊨ φ^∃(σ, b) holds for the unique σ ∈ U* of length m*.


Proof. We define φ^∃ along with an auxiliary formula φ^∀(x, z) such that for every
tuple b of elements of M and every condition (U, m) in M the following hold.

4. If M ⊨ φ^∀(σ, b) for some σ ∈ 2^{<M} then M ⊨ φ^∀(ρ, b) for every ρ ⪯^M σ.
5. If M ⊨ φ^∀(σ, b) for all σ ∈ U then there is some (U*, m*) ≤ (U, m) in M such
that (U*, m*) ⊩^M φ(b).
6. If (U, m) ⊩^M φ(b) then M ⊨ φ^∀(σ, b) for all σ ∈ U.


View φ(z) as φ(G, z) for a formula φ(X, z) of L_2. We proceed by induction on the
complexity of φ. First, suppose φ is atomic. If X does not occur in φ, then for every
tuple b, φ(b) is a sentence of L_2 which is either true in M and every condition
forces it, or false in M and every condition forces its negation. In this case, we let
φ^∃ = φ^∀ = φ.

If X does occur in φ, then φ is t ∈ X for some term t. We let

φ^∃(x, z) ≡ t < |x| ∧ x(t) = 1

and

φ^∀(x, z) ≡ t < |x| → x(t) = 1.

We now complete the induction. If φ = ¬ψ we let φ^∃ = ¬ψ^∀ and φ^∀ = ¬ψ^∃. If φ =
ψ_0 ∨ ψ_1 we let φ^∃ = ψ_0^∃ ∨ ψ_1^∃ and φ^∀ = ψ_0^∀ ∨ ψ_1^∀. And if φ(X, z) ≡ (∃y < t)ψ(X, y, z),
we let φ^∃(x, z) ≡ (∃y < t)ψ^∃(x, y, z) and φ^∀(x, z) ≡ (∃y < t)ψ^∀(x, y, z).
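The recursive definition of the two translations can be made concrete with a small sketch. The encoding of formulas as nested tuples, the function names, and the sample formula below are assumptions made purely for illustration; the modes "E" and "A" stand for φ^∃ and φ^∀, and finite binary strings stand in for elements of 2^{<M}.

```python
from itertools import product

# Formulas of the forcing language, as nested tuples (a hypothetical encoding):
#   ("const", b)      an atomic sentence not mentioning G, with truth value b
#   ("in", t)         the atomic formula "t in G" for a number t
#   ("not", phi), ("or", phi, psi)
#   ("ex", t, body)   (exists y < t) body(y), where body maps y to a formula

def holds(phi, sigma, mode):
    """Evaluate phi^E (mode "E") or phi^A (mode "A") on the binary string sigma."""
    tag = phi[0]
    if tag == "const":
        return phi[1]
    if tag == "in":
        t = phi[1]
        if mode == "E":
            return t < len(sigma) and sigma[t] == 1
        return t >= len(sigma) or sigma[t] == 1
    if tag == "not":
        # Negation swaps the two translations.
        return not holds(phi[1], sigma, "A" if mode == "E" else "E")
    if tag == "or":
        return holds(phi[1], sigma, mode) or holds(phi[2], sigma, mode)
    if tag == "ex":
        return any(holds(phi[2](y), sigma, mode) for y in range(phi[1]))
    raise ValueError(f"unknown connective: {tag}")

def strings(max_len):
    """All binary strings of length at most max_len, as tuples."""
    return [s for n in range(max_len + 1) for s in product((0, 1), repeat=n)]

# Sample formula: (exists y < 3)[y in G and y + 1 not in G], written with "not" and "or".
sample = ("ex", 3, lambda y: ("not", ("or", ("not", ("in", y)), ("in", y + 1))))
```

Running `holds(sample, s, "E")` over `strings(5)` lets one check by brute force that the "E" translation persists to extensions of a string, as in property (1), while the "A" translation persists to initial segments, as in property (4).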

Verifying properties (1)-(6) is straightforward in all these cases, with the
exception of property (5) in the very last case. So suppose that for some tuple b of elements
of M and some condition (U, m), M ⊨ φ^∀(σ, b) for all σ ∈ U. We claim that for
some a <^M t^M the set of all σ ∈ U such that M ⊨ ψ^∀(σ, a, b) is M-infinite. If we
have this, let Ũ be the set of all these σ. Then Ũ is a tree in M by property (4), so
(Ũ, m) is an extension of (U, m). Now by induction there is an extension (U*, m*)
of (Ũ, m), and hence of (U, m), that forces ψ(G, a, b). But then this extension also
forces (∃y < t)ψ(G, y, b), which is φ(G, b), so we have (5) for φ. This means that
proving the claim finishes the proof.

Seeking a contradiction, suppose the claim is false. Then for every a <^M t^M
the set of σ ∈ U such that M ⊨ ψ^∀(σ, a, b) is M-finite. In particular, for every
a <^M t^M there is a level ℓ ∈ M such that M ⊨ ¬ψ^∀(σ, a, b) for every σ ∈ U
with |σ| = ℓ. Notice that by hypothesis on ψ and property (4), this means also that
M ⊨ ¬ψ^∀(τ, a, b) for every τ ∈ U with |τ| ≥ ℓ. Now since ψ is Σ^0_0 we can apply BΣ^0_1
in M to find an m ∈ M such that for every a <^M t^M there exists such an ℓ with
ℓ ≤^M m. But then M ⊨ ¬ψ^∀(σ, a, b) for every σ ∈ U with |σ| = m and every
a <^M t^M, which cannot be. □


Proof (of Theorem 7.7.3). Fix a countable L_2 structure M and a T ∈ S^M such that

M ⊨ RCA_0 + “T is an infinite binary tree”.

Consider again Jockusch–Soare forcing with subtrees of T inside M. Let G be 2-S^M-generic
in M. We then claim that M[G] ⊨ RCA_0 + “G is a path through T”. From
here, the result follows by Corollary 5.9.6. That M[G] satisfies Δ^0_1 comprehension
follows by definition, while the fact that G is a path through T in M[G] follows by
genericity. So it is enough to show that M[G] satisfies Σ^0_1 induction.
Seeking a contradiction, suppose φ(X, x) is a Σ^0_1 formula of L_2 such that

M[G] ⊨ φ(G, 0) ∧ (∀x)[φ(G, x) → φ(G, x + 1)] ∧ ¬φ(G, a)

for some a ∈ M. By Theorem 7.6.6, there is a condition (U, m) such that G is a path
through U and

(U, m) ⊩^M φ(G, 0) ∧ (∀x)[φ(G, x) → φ(G, x + 1)] ∧ ¬φ(G, a).    (7.2)

Write φ(X, x) as (∃y)ψ(X, x, y), where ψ is Σ^0_0. Let ψ^∃ be as given by Lemma 7.7.4.
Then for each b ∈ M, the set U_b = {σ ∈ U : M ⊨ (∀y < |σ|)¬ψ^∃(σ, b, y)} is
M-infinite if and only if, in M, there is an extension (U*, m*) of (U, m) that forces
¬φ(G, b). So if we let A be the set of all b ≤^M a for which U_b is M-infinite, then
a ∈ A and 0 ∉ A. Now A ∈ S^M by bounded Π^0_1 comprehension, and so it has a
least element, b >^M 0. It follows that there is a (U*, m*) such that M satisfies that
(U*, m*) is an extension of (U, m) forcing φ(G, b − 1) ∧ ¬φ(G, b), contradicting
(7.2). □


Using a more refined argument, it is possible to obtain a version of Harrington’s
theorem for Σ^0_2 instead of Σ^0_1.


Theorem 7.7.5 (Cholak, Jockusch, and Slaman [33]). Every countable model of
RCA_0 + IΣ^0_2 is an ω-submodel of a countable model of WKL_0 + IΣ^0_2.


We omit the proof here. As a corollary, WKL_0 + IΣ^0_2 is Π^1_1 conservative over RCA_0 +
IΣ^0_2. This fact, which is of independent interest, actually holds at all levels of the
arithmetical hierarchy.


Theorem 7.7.6 (Hájek [132]; Avigad [7]). For all n ≥ 1, WKL_0 + IΣ^0_n is Π^1_1
conservative over RCA_0 + IΣ^0_n.


7.8 Exercises 


Exercise 7.8.1. Let G be Cohen 1-generic. 


1. Write G = G_0 ⊕ G_1. Show that G_0 and G_1 form a minimal pair, meaning that
for any set X computable from both G_0 and G_1, X is computable.
2. Write G = ⊕_{i∈ω} G_i. Show that for all i, G_i ≰_T ⊕_{j∈ω, j≠i} G_j.


Exercise 7.8.2. Prove part (1) of Proposition 7.5.8. 


Exercise 7.8.3 (Yu [329]). Let G be Cohen n-generic and write G = G_0 ⊕ G_1.
Show that G_0 is Cohen n-generic and that G_1 is Cohen n-generic relative to G_0.
(This is analogous to van Lambalgen’s theorem (Theorem 9.4.17) from algorithmic 
randomness. ) 
Exercise 7.8.4. Give an example of a notion of forcing, a sentence φ of the forcing
language, and a condition that does not force φ ∨ ¬φ.


Exercise 7.8.5. Show that in any forcing notion, if φ is any sentence in the forcing
language and p any condition, then p ⊩ ¬φ if and only if p ⊩ ¬¬¬φ.


Exercise 7.8.6. Complete the proof of Proposition 7.4.2. 
Exercise 7.8.7. Prove Lemma 7.5.10. 


Exercise 7.8.8. This exercise explains why the path in Harrington’s theorem
(Theorem 7.7.3) had to be chosen so carefully. Fix n > 1, and let M be a model of RCA_0 + ¬IΣ^0_n.
Let T = 2^{<M}, so that T ∈ S^M and M satisfies that T is an infinite binary tree. Show
that there is a path f through T such that M[f] ⊭ IΣ^0_1. (Hint: Use the fact from
Theorem 6.2.3 that M has a Σ^0_n definable proper cut.)


Exercise 7.8.9. Let (E, I) be a Mathias condition, and let φ(X) be a Σ^0_1 formula of L_2.
Show that if (E, I) ⊩ φ(G) then φ(S) holds for every set S with E ⊆ S ⊆ E ∪ I.


Exercise 7.8.10. Let (E*, I*) ≤ (E, I) be Mathias conditions and let φ be a Σ^0_1
formula of L_2(G). Show that if (E*, I*) ⊩ φ then so does (E*, I \ max E*). So
in Mathias forcing, Σ^0_1 formulas can always be forced by finite extensions. (Hint:
Induction on complexity.)


Exercise 7.8.11. Show that in Mathias forcing with computable reservoirs there exist 
low 1-generics. (Hint: Use Exercise 7.8.10.) 


Part III 
Combinatorics 


Chapter 8
Ramsey’s theorem


Ramsey proved his eponymous theorem in his 1929 paper “On a problem of formal 
logic” [253]. The focus was actually not on combinatorics at all, but rather on the 
Entscheidungsproblem, which was still open at the time. (The dramatic results of 
Church and Turing showing that the problem is unsolvable were still a few years 
away.) Ramsey used his theorem to study a special case of the problem. More 
precisely, he used the finitary Ramsey’s theorem (Definition 3.3.6), though in fact 
he proved the infinite version first and then deduced the finitary one from it. (This 
is still a common proof.) In the intervening hundred years, both versions have had a 
vast impact and legacy. Our interest will be confined to the infinite version. 

In broad terms, Ramsey’s theorem can be thought of as saying that some amount 
of order is necessary in any configuration of objects. Understanding this order—how 
it is organized and how it can be described—has been the focus of much interest 
in combinatorics and logic, not least because it is so intrinsically captivating. In 
computability theory specifically, it has spawned a long and fruitful line of research. 

The earliest foray into the effective aspects of Ramsey’s theorem was by Specker, 
who showed that Ramsey’s theorem does not hold computably. 


Theorem 8.0.1 (Specker [300]). For every n ≥ 2, RT^n_2 omits computable solutions.


Historically, this was an important result, showing that a purely combinatorial result 
could have interesting logical (and in particular, computability theoretic) content. The 
mantle was then picked up by Jockusch in his seminal paper “Ramsey’s theorem and 
recursion theory” [168], in which he not only significantly extended Specker’s result, 
but brought to bear the full machinery of the time to give a remarkably deep analysis 
of Ramsey’s theorem from the point of view of computability and complexity. This, 
it can be said, founded and firmly established computable combinatorics as a viable 
field. 

In this and the next chapter, we survey the reverse mathematics of combinatorics, 
starting now with Ramsey’s theorem itself. Note that we have collected a number 
of results about Ramsey’s theorem already, in Chapters 3, 4 and 6. For example, we 
have Hirst’s theorem (Theorem 6.5.1) that RT^1 is equivalent over RCA_0 to BΣ^0_2. We
have also seen, in Corollary 4.5.10, that RT^n_2 ≡_ω RT^n_k for every k ≥ 2. The proof
can be easily formalized to show that for all (standard) k ≥ 2, RCA_0 ⊢ RT^n_2 ↔ RT^n_k.
(The point being that when k is standard the induction in the proof can be carried
out externally, rather than in RCA_0.) Of course, it is also trivially the case that
RCA_0 ⊢ (∀n)RT^n_1.

For completeness, we mention also that RCA_0 proves the finitary Ramsey’s
theorem (FRT) defined in Definition 3.3.6. A proof in PA^− + IΣ^0_1 is given by Hájek and
Pudlák [134, Chapter II.1]. Since FRT can be expressed as an arithmetical, and hence
Π^1_1, statement, provability in RCA_0 follows by Corollary 5.10.2.

We begin our discussion by understanding the complexity of homogeneous sets 
in terms of the arithmetical hierarchy. This will also give us some preliminary results 
in terms of subsystems of second order arithmetic. 


8.1 Upper bounds 


Our first result is the following important theorem of Jockusch. 
Theorem 8.1.1 (Jockusch [168]). For all n ≥ 1, RT^n admits Π^0_n solutions.

For n = 1 this is trivial. For n ≥ 2 it follows by taking I = ω in the following lemma.


Lemma 8.1.2 (Jockusch [168]). Fix n ≥ 2, k ≥ 1, and an infinite set I. Then every
c: [I]^n → k has an infinite homogeneous set H ⊆ I that is Π^0_n(c ⊕ I).


To prove it, we introduce an auxiliary notion that is quite helpful in certain kinds 
of constructions of homogeneous sets. 


Definition 8.1.3. Fix n ≥ 2, k ≥ 1, and a coloring c: [ω]^n → k. A set F is
pre-homogeneous for c if for each x⃗ ∈ [F]^{n−1} there is an i < k such that c(x⃗, y) = i for
all y ∈ F with x⃗ < y.
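For finite sets the definition can be checked mechanically. The following is a minimal sketch; the function names and the sample coloring are assumptions chosen for illustration, and a coloring is represented as a function on strictly increasing n-tuples.

```python
from itertools import combinations

def is_prehomogeneous(F, c, n):
    """Check the pre-homogeneity condition on a finite set F for a coloring c of n-tuples."""
    F = sorted(F)
    for xs in combinations(F, n - 1):
        # Colors c(xs, y) over all y in F lying above every element of xs.
        colors = {c(xs + (y,)) for y in F if y > xs[-1]}
        if len(colors) > 1:
            return False
    return True

def is_homogeneous(F, c, n):
    """Check that c is constant on the n-element subsets of the finite set F."""
    return len({c(t) for t in combinations(sorted(F), n)}) <= 1

# Hypothetical coloring of pairs: the color is the parity of the smaller element.
def parity_color(pair):
    return pair[0] % 2
```

Under this coloring every set is pre-homogeneous, since the color of a pair depends only on its smaller element, but a set containing numbers of both parities below its maximum is not homogeneous.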


We leave the following to the reader and proceed to the proof of the main result. 


Lemma 8.1.4. Fix n ≥ 2, k ≥ 1, and a coloring c: [ω]^n → k. If P is an infinite
pre-homogeneous set for c then c has a (c ⊕ P)-computable infinite homogeneous
set H ⊆ P.
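For pairs, the idea behind the lemma can be sketched on a finite set: each x in a pre-homogeneous set has a well-defined induced color, and the elements sharing one induced color form a homogeneous set. The code below is an illustrative finite analogue only, with a hypothetical function name; picking a most frequent color stands in for picking a color that occurs infinitely often.

```python
from collections import Counter

def thin_to_homogeneous(P, c):
    """Finite analogue for pairs (n = 2): thin a pre-homogeneous set P to a homogeneous one."""
    P = sorted(P)
    if len(P) < 2:
        return P, None
    # Induced color d(x) = c(x, y) for the least y > x in P.
    # The last element has no successor in P and is dropped.
    d = {x: c((x, y)) for x, y in zip(P, P[1:])}
    color, _ = Counter(d.values()).most_common(1)[0]
    return [x for x in P[:-1] if d[x] == color], color
```

If P is pre-homogeneous then c(x, y) = d(x) for all x < y in P, so any two elements of the returned set receive the chosen color.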


Proof (of Lemma 8.1.2). By induction on n. 


Inductive basis. Assume n = 2. As usual, we restrict to computable instances
for simplicity, with the full result following by relativization. So fix k ≥ 2, an
infinite computable set I, and a computable coloring c: [I]^2 → k. We begin by
constructing a Π^0_2 infinite set P ⊆ I which is pre-homogeneous for c. We use a ∅′
oracle to enumerate elements of I into the complement P̄ of P by stages. At each stage s, we define a finite
nonempty set F_s of elements of I not yet enumerated into P̄, as well as a function
d_s: F_s → k. We enumerate at most finitely many elements into P̄ at each stage.
Construction.
Stage s = 0: Let F_0 = {min I} and d_0(min I) = 0.


Stage s > 0: Assume inductively that F_{s−1} and d_{s−1} have been defined. Fix the
largest z ∈ F_{s−1} for which the following holds: there exists w > F_{s−1} in I such that
w has not yet been enumerated into P̄ and c(x, w) = d_{s−1}(x) for all x < z in F_{s−1}.
Note that z exists, since the latter condition is always satisfied by z = min F_{s−1}. Fix a
corresponding w that minimizes c(z, w), and let F_s = {x ∈ F_{s−1} : x ≤ z} ∪ {w}. Then,
define

d_s(x) = d_{s−1}(x) if x < z,
d_s(x) = c(z, w) if x = z,
d_s(x) = 0 if x = w.

Finally, enumerate all numbers x ≤ s in I \ F_s into P̄.


Verification. As the construction is computable in ∅′, it follows that P̄ is Σ^0_1(∅′).
Hence, P is Π^0_2, as desired. We prove the following claims.


Claim 1: A number x ∈ I belongs to P if and only if x ∈ F_s for all sufficiently large
s. By construction, x ∈ I is enumerated into P̄ at a stage s if and only if x ≤ s and
x ∉ F_s.


Claim 2: If x ∈ P and x ∈ F_s for some s then x ∈ F_t for all t ≥ s. If x ∈ F_s \ F_{s+1}
then it must be that the numbers z and w found at stage s + 1 of the construction
satisfy z < x < w. But then after stage s + 1, only numbers larger than w are ever
eligible to be part of P.


Claim 3: For each x ∈ P, lim_s d_s(x) exists. Fix x ∈ P, and assume the claim for all
y < x in P. By the first claim, we can find s_0 ≥ x such that for all s ≥ s_0, if y ≤ x
belongs to P then y ∈ F_s, and if y < x is in P then d_s(y) = d_{s_0}(y). Note that since
s_0 ≥ x, every y < x in I \ P is enumerated into P̄ by stage s_0, so no such y can belong
to F_s for any s ≥ s_0. Now the only reason we could have d_s(x) ≠ d_{s_0}(x) for some
s > s_0 is if x is the number z found at stage s of the construction. By construction,
and our hypothesis on the y < x, this means there are infinitely many w ∈ I such that
c(y, w) = d_{s_0}(y) for each y < x in P, but c(x, w) ≠ d_{s_0}(x) for almost all such w. It
follows that the value of d_s(x) is changed in at most finitely many stages s > s_0.


Claim 4: P is pre-homogeneous. Fix x ∈ P, and let s_0 be the least stage s so that
x ∈ F_s. By minimality of s_0, either s_0 = 0 and x = min I, or s_0 > 0 and x is the
number w found at stage s_0 of the construction. In any case, x is the largest element
of F_{s_0}. By Claim 2, if s ≥ s_0 then x ∈ F_s. Thus, the number z found at any such
stage s of the construction must satisfy x ≤ z. Moreover, as noted in the preceding
claim, the number of such stages s at which x = z must be finite. Let s_1 > s_0 be
the final stage at which x = z, or s_0 + 1 if there is no such stage. Thus, F_{s_1} contains
precisely one element larger than x and d_{s_1}(x) = d_s(x) for all s ≥ s_1. Moreover, at
every stage s > s_1 an element w is found and added to F_s, and this number must
satisfy c(x, w) = d_s(x) = d_{s_1}(x).
Claim 5: P is infinite. The proof is similar to that of Claim 4. Suppose x ∈ P. We
exhibit a larger element of P. Let s_1 be as in Claim 4, i.e., the last stage s such that x
is the number z found at this stage. Thus F_{s_1} contains precisely one element w > x,
and we have d_{s_1}(x) = d_s(x) for all s ≥ s_1, and d_{s_1}(x) = c(x, w) by definition. Hence,
w is never enumerated into P̄ and so belongs to P.


The verification is complete. 


To complete the proof of the n = 2 case, define H_i = {x ∈ P : lim_s d_s(x) = i} for
each i < k. Thus, each H_i is homogeneous for c with color i. By construction, for
each x ∈ P the value of d_s(x) can only increase with s. Hence, for each i we have that
x ∈ ∪_{j≤i} H_j if and only if d_s(x) ≤ i for all s. Since d ≤_T ∅′, this means ∪_{j≤i} H_j
is Π^0_1(∅′) and therefore Π^0_2. Let i < k be least such that H_i is infinite, and fix m so
that no x > m has lim_s d_s(x) < i. Then H = {x ∈ ∪_{j≤i} H_j : x > m} is a Π^0_2 infinite
homogeneous set for c.


Inductive step. Assume n > 2 and the result holds for n − 1. Fix k ≥ 2, I ⊆ ω,
and a coloring c: [I]^n → k. By the low basis theorem (Theorem 2.8.18) relative to
(c ⊕ I)′, fix X ≫ (c ⊕ I)′ with X′ ≤_T (c ⊕ I)″. It then suffices to produce an infinite
homogeneous set H for c which is Π^0_{n−1}(X). For, by Post’s theorem (Theorem 2.6.2),
we have that

Π^0_{n−1}(X) = Π^0_{n−2}(X′) ⊆ Π^0_{n−2}((c ⊕ I)″) = Π^0_n(c ⊕ I)

as classes of sets. We first define an X-computable set I* = {x_0 < x_1 < ···} ⊆ I
inductively as follows. Let x_0 < ··· < x_{n−2} be the least n − 1 elements of I and
define I_{n−1} = {x ∈ I : x > x_{n−2}}. Suppose next that for some s ≥ n − 1 we have
defined x_0 < ··· < x_{s−1} along with a (c ⊕ I)-computable infinite set I_s ⊆ I with
x_{s−1} < I_s. Since s ≥ n − 1, we may fix m ≥ 1 so that there are m elements of
[{x_0, ..., x_{s−1}}]^{n−1} containing x_{s−1}. Enumerate these as x⃗_0, ..., x⃗_{m−1}. Now define
J_0 = I_s and suppose that for some ℓ < m we have defined a (c ⊕ I)-computable
infinite set J_ℓ ⊆ I_s. Since X ≫ (c ⊕ I)′, it follows by Theorem 2.8.25 that X can
uniformly compute a color i_ℓ < k such that there are infinitely many y ∈ J_ℓ with
c(x⃗_ℓ, y) = i_ℓ. Let J_{ℓ+1} = {y ∈ J_ℓ : c(x⃗_ℓ, y) = i_ℓ}. Finally, let x_s be the least element
of J_m, and let I_{s+1} = {x ∈ J_m : x > x_s}. Thus, for each ℓ < m we have c(x⃗_ℓ, x_s) = i_ℓ
and c(x⃗_ℓ, y) = i_ℓ for all y ∈ I_{s+1}.

By construction, if x⃗ ∈ [I*]^{n−1} then c(x⃗, y) is the same for all y > x⃗ in I*.
We may thus define a (c ⊕ I*)-computable, and hence X-computable, coloring
c*: [I*]^{n−1} → k where c*(x⃗) = c(x⃗, y) for the least y > x⃗ in I*. By induction
hypothesis, there is a Π^0_{n−1}(X) infinite homogeneous set H ⊆ I* ⊆ I for c*. Clearly,
this is also homogeneous for c. The proof is complete. □
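The thinning used in the inductive step has a simple finite analogue, sketched below. The function name is hypothetical, and keeping a largest color class replaces the appeal to X for finding a color class that is infinite, so this illustrates only the combinatorial skeleton and not the computability-theoretic content.

```python
from itertools import combinations

def greedy_prehomogeneous(I, c, n, k):
    """Pick x_0 < x_1 < ... from the finite set I while shrinking a reservoir.

    c colors strictly increasing n-tuples with colors below k.
    """
    reservoir = sorted(I)
    chosen = []
    while reservoir:
        chosen.append(reservoir[0])
        reservoir = reservoir[1:]
        if len(chosen) < n - 1:
            continue
        last = chosen[-1]
        # Handle every (n-1)-tuple of chosen points that contains the newest point.
        for head in combinations(chosen[:-1], n - 2):
            xs = head + (last,)
            classes = [[y for y in reservoir if c(xs + (y,)) == i] for i in range(k)]
            # Finite stand-in for "a color with infinitely many y".
            reservoir = max(classes, key=len)
    return chosen
```

Every later choice is drawn from the thinned reservoir, so for each (n − 1)-tuple of chosen points the color of the tuple together with any later chosen point is constant, which is pre-homogeneity of the output.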


The fact that Ramsey’s theorem admits arithmetical solutions should lead us to 
believe that the theorem is provable in ACA_0. Indeed, for each fixed exponent, this is
the case. Formalizing the kind of argument we just gave is tricky, however. Instead, 
we give a separate argument that is more direct and less delicate. 
Proposition 8.1.5. For all n ≥ 1, ACA_0 ⊢ RT^n.


Proof. For n = 1, this follows by Hirst’s theorem (Theorem 6.5.1) and the fact that
BΣ^0_2 is provable in ACA_0. For n ≥ 2, we recall the proof of Proposition 4.5.12 which
gave an explicit construction of an infinite homogeneous set for a given k-coloring.
This can be readily formalized in ACA_0. Note that the proof featured an induction:
for each m < n, we used RT^m to prove RT^{m+1}. But as n here is a fixed standard
number, and the induction is finite (proceeding only up to m = n − 1), we are not
actually using induction within ACA_0 here. □


What of the full Ramsey’s theorem, RT? It is still a problem that admits
arithmetical solutions, but both Theorem 8.1.1 and the proof of the preceding proposition
seem to suggest that the complexity of the solutions increases with the exponent.
And we will soon see that this is unavoidable. (So, e.g., there is no m ∈ ω so that
every computable instance of RT has a ∅^(m)-computable solution.) From here, the
fact that ACA_0 does not prove RT follows by Theorem 5.6.7. (Another way to see
this is described in Exercise 8.10.8.) The evident reason is that ACA_0 cannot prove
(∀n)(∀X)[X^(n) exists]. That much is the role of the subsystem ACA_0′, from
Definition 5.6.5, discussed in detail in Section 5.6.1. And this turns out to capture
the strength of RT precisely.


Theorem 8.1.6 (McAloon [207]). RCA_0 ⊢ RT ↔ ACA_0′.


Proof. First, we show ACA_0′ ⊢ RT. Arguing in ACA_0′, fix n ≥ 2, k ≥ 1, and c: [ℕ]^n →
k. Let α_0 be an increasing string of length n − 1. Let T be the set of all α ⪯ α_0 as
well as all increasing α ∈ ℕ^{<ℕ} satisfying the following conditions.

• α_0 ⪯ α.
• range(α) is pre-homogeneous for c.
• For each j < |α|, there are infinitely many y such that c(x⃗, y) = c(x⃗, α(j)) for
every x⃗ ∈ [range(α ↾ j)]^{n−1}, and α(j) is the least such y.

T is clearly a tree, and it is finitely branching since for each α ∈ T and each i < k
there can be at most one x such that αx ∈ T. Note that the defining conditions above
are uniformly arithmetical in n, and so T is Δ^0_ℓ-definable, for some (standard)
number ℓ. Thus, T exists.

We claim that T is infinite. Since T is finitely branching, we must show that for
each ℓ there is an α ∈ T with |α| = ℓ. We proceed by induction on ℓ. If ℓ ≤ n − 1,
we have α_0 in T along with its initial segments. Suppose next that ℓ = n. There is a
single tuple x⃗ ∈ [range(α_0)]^{n−1}, and by RT^1, there is an i < k such that c(x⃗, y) = i
for infinitely many y. Fix some such i, and the least corresponding y. Then α_0 y ∈ T.
Finally, suppose ℓ > n, and assume the result holds for ℓ − 1. Fix any α ∈ T of length
ℓ − 1 and let x = α(|α| − 1), so that α = α*x. Let I be the set of all y > x such that
c(x⃗, y) = c(x⃗, x) for all x⃗ ∈ [range(α*)]^{n−1}, which is infinite by assumption. Now
enumerate the tuples x⃗ ∈ [range(α)]^{n−1} that contain x as x⃗_0, ..., x⃗_{m−1}. Define
p: m → k by recursion, as follows: for each s < m, let p(s) be the least i < k
such that there are infinitely many y ∈ I such that c(x⃗_t, y) = p(t) for all t < s and
c(x⃗_s, y) = i. Let I* be the set of all y ∈ I such that c(x⃗_s, y) = p(s) for all s < m,
which is infinite by construction. Let x* = min I*; then it is easily seen that αx* ∈ T.
The claim is proved.

Since T is a finitely branching, infinite subtree of ℕ^{<ℕ}, we may fix a path f ∈
[T]. It is clear that range(f) is an infinite pre-homogeneous set for c. For each
x⃗ ∈ [range(f)]^{n−1}, let d(x⃗) = c(x⃗, y) for the least y > x⃗ in range(f), which gives us
an instance d: [range(f)]^{n−1} → k of RT^{n−1}_k. As in Lemma 8.1.4, it now follows that if
H ⊆ range(f) is homogeneous for d then H is homogeneous for c. This completes
the proof that ACA_0′ ⊢ RT.

Now assume RCA_0 + RT. We have RT → RT^3_2, and in Corollary 8.2.6 below, we will see
that RT^3_2 → ACA_0. Thus, we may argue in ACA_0 + RT. To derive ACA_0′, fix a set X
and n ≥ 1. We must show that X^(n) exists. For an arbitrary set Z and s_0 ∈ ω, define

Z^(0)_{s_0} = Z ↾ s_0,

and for i ≥ 1 and s_0, ..., s_i ∈ ω, define

Z^(i)_{s_0,...,s_i} = {e < s_0 : (∃s < s_1)[Φ_{e,s}^{Z^(i−1)_{s_1,...,s_i}}(e)↓]}.

Here we refer to a coding of Turing functionals in second order arithmetic. The set
Z^(i)_{s_0,...,s_i} exists for all i and s_0, ..., s_i.
We now define a coloring c: [ω]^{2n+1} → n + 2, which is an instance of RT. Given
s_0 < s_1 < ··· < s_n < t_1 < ··· < t_n, first check if there is an i ≤ n such that

X^(i)_{s_0,s_1,...,s_i} ≠ X^(i)_{s_0,t_1,...,t_i}.

If so, set c(s_0, s_1, ..., s_n, t_1, ..., t_n) = i for the least such i. And otherwise, set
c(s_0, s_1, ..., s_n, t_1, ..., t_n) = n + 1. Clearly, c exists, and we may thus apply RT to
obtain an infinite homogeneous set H for it.
We claim that for every i ≤ n and all s_0 < ··· < s_i in H,

X^(i)_{s_0,...,s_i} = X^(i) ↾ s_0.

Notice that the statement of the claim is arithmetical, so since we are working in
ACA_0 we can prove it by induction. For i = 0, this is clear. So suppose 1 ≤ i ≤ n and
the result holds for i − 1. Thus for all t_1 < ··· < t_i in H we have that

X^(i−1)_{t_1,...,t_i} = X^(i−1) ↾ t_1.

Fix s_0 < ··· < s_i in H. Fix u so that for all t ≥ u and all e < s_0, we have that

Φ_e^{X^(i−1)}(e)↓ ↔ (∃s < t)[Φ_{e,s}^{X^(i−1) ↾ t}(e)↓].

Then for any tuple t_1 < ··· < t_i in H with t_1 larger than s_0 and u, we have

X^(i)_{s_0,t_1,...,t_i} = {e < s_0 : (∃s < t_1)[Φ_{e,s}^{X^(i−1)_{t_1,...,t_i}}(e)↓]}
= {e < s_0 : (∃s < t_1)[Φ_{e,s}^{X^(i−1) ↾ t_1}(e)↓]}
= X^(i) ↾ s_0.

But t_1, ..., t_i were arbitrary. So if we take any t_{i+1} < ··· < t_n < u_1 < ··· < u_n in H
with t_{i+1} larger than t_i, then

X^(i)_{s_0,t_1,...,t_i} = X^(i) ↾ s_0 = X^(i)_{s_0,u_1,...,u_i},

and consequently c(s_0, t_1, ..., t_n, u_1, ..., u_n) ≠ i. Since H is homogeneous, we must
have c(s_0, s_1, ..., s_n, t_1, ..., t_n) ≠ i, and therefore

X^(i)_{s_0,s_1,...,s_i} = X^(i)_{s_0,t_1,...,t_i} = X^(i) ↾ s_0,

as desired.
With the claim in hand, we proceed as follows. Let p: ω → H be the principal
function of H. Then, for each i ≤ n, define

X_i = ∪_{j∈ω} X^(i)_{p(j),...,p(j+i)},

which exists by arithmetical comprehension. As we showed, X^(i)_{p(j),...,p(j+i)} =
X^(i) ↾ p(j) for every j, and therefore X_i = X^(i). Thus, ⟨X_0, ..., X_n⟩ is a sequence
witnessing that X^(n) exists. The proof is complete. □


8.2 Lower bounds 


We now turn to lower bounds. Our first task is to show that the Π^0_n bound from
Theorem 8.1.1 cannot be improved. This also yields a significant strengthening of
Specker’s theorem (Theorem 8.0.1). The following lemma will be useful here and 
subsequently. 


Lemma 8.2.1. Fix n, k ≥ 1, and let c*: [ω]^n → k be a ∅′-computable coloring.
There exists a computable c: [ω]^{n+1} → k such that every infinite homogeneous set
for c is homogeneous for c*.


Proof. Let c be a computable approximation to c* as given by the limit lemma. That
is, c is a map [ω]^{n+1} → k such that lim_s c(x⃗, s) = c*(x⃗) for every tuple x⃗ ∈ [ω]^n. It
is easy to see that this has the desired property. □


Theorem 8.2.2 (Jockusch [168]). For each n ≥ 2, RT^n_2 omits Σ^0_n solutions.


Proof. We aim to show that for every set A there is an A-computable instance c of
RT^n_2 with no Σ^0_n(A) solution. It suffices to show c has no Δ^0_n(A) solution since every
Σ^0_n(A) infinite set has a Δ^0_n(A) infinite subset. The proof is now by induction on
n ≥ 2.


Inductive basis. Fix n = 2. We construct a computable coloring c: [ω]^2 → 2 with
no infinite Δ^0_2 solution. The result follows by relativization. We aim to ensure that for
each e, Φ_e(∅′) is not (the characteristic function of) an infinite homogeneous set for
c. We thus assume throughout the proof that if Φ_e(∅′)(x)↓ then Φ_e(∅′)(x) < 2 and
also Φ_e(∅′)(y)↓ for all y < x. For each e, we also fix a computable approximation
{D_{e,s} : s ∈ ω} to Φ_e(∅′). Thus D_{e,s} is a finite subset of ω ↾ s, and for each x < s
we have x ∈ D_{e,s} if and only if Φ_e(∅′)(x)[s]↓ = 1. In particular, if Φ_e(∅′)(x) = 1
then x ∈ D_{e,s} for all sufficiently large s, and if Φ_e(∅′)(x) = 0 then x ∉ D_{e,s} for all
sufficiently large s.

We construct c by stages. At stage s > 0, we define c(x, s) for all x < s. Fix
s, and assume c is defined on all pairs (x, y) with x < y < s. For each e < s in
order, proceed as follows. Compute the set D_{e,s}, and if it contains at least 2e + 2
many elements, fix the least two that are not claimed (to be defined below) by any
e* < e. Say these numbers are x < y, and say that e claims these numbers. Then,
define c(x, s) = 0 and c(y, s) = 1. If D_{e,s} does not contain at least 2e + 2 many
elements, do nothing; in this case, e claims no numbers. Now if e + 1 < s, proceed
to e + 1 (meaning, repeat the above with e + 1 in place of e). If instead e + 1 = s,
define c(x, s) = 0 for any x not claimed by any e* < s, and move to stage s + 1. This
completes the construction.
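A single stage of this construction can be sketched as follows. The function `color_stage` and the approximation `toy_D` are hypothetical stand-ins introduced only for illustration; in the proof, D_{e,s} comes from a computable approximation to Φ_e(∅′), which is not modeled here.

```python
def color_stage(s, D):
    """Return {x: c(x, s)} for all x < s, following the stage-s instructions.

    D(e, s) is a hypothetical stand-in for the finite set D_{e,s}, a subset of {0, ..., s-1}.
    """
    claimed = {}  # number -> color assigned to the pair (number, s) at this stage
    for e in range(s):
        elems = sorted(D(e, s))
        if len(elems) >= 2 * e + 2:
            # At most 2e numbers are claimed by smaller indices, so two free ones exist.
            free = [x for x in elems if x not in claimed]
            x, y = free[0], free[1]
            claimed[x], claimed[y] = 0, 1
    # Unclaimed numbers receive color 0.
    return {x: claimed.get(x, 0) for x in range(s)}

def toy_D(e, s):
    """Toy approximations: index 0 guesses the even numbers, index 1 guesses everything."""
    if e == 0:
        return {x for x in range(s) if x % 2 == 0}
    if e == 1:
        return set(range(s))
    return set()
```

With these toy approximations, at stage 6 index 0 claims 0 and 2 and index 1 claims 1 and 3, so each index sees two of its guessed elements colored differently when paired with 6.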

We clearly end up with a computable coloring c: [ω]^2 → 2. To verify that c has
no Δ^0_2 infinite homogeneous set, fix any e such that Φ_e(∅′) is total and (the set it
defines) contains at least 2e + 2 many elements. Let x_0 < ··· < x_{2e+1} be the least
2e + 2 many such elements. Now fix s_0 so that for all s ≥ s_0 and all x ≤ x_{2e+1} we have
that D_{e,s}(x) = Φ_e(∅′)(x). Thus for all s ≥ s_0, the least 2e + 2 elements of D_{e,s} are
precisely x_0 < ··· < x_{2e+1}. Since each e* < e claims at most two numbers at every
stage, collectively these e* can claim at most 2e many numbers. Thus, at every stage
s > max{e, s_0}, e will be able to claim x_i < x_j for some i < j < 2e + 2, and we
will thus define c(x_i, s) ≠ c(x_j, s). Notice that i and j here may depend on s! But
if x_0, ..., x_{2e+1} are all part of an infinite homogeneous set for c, it follows that this
set cannot contain any s > max{e, s_0} and so must be finite. Thus, Φ_e(∅′) cannot
be an infinite homogeneous set for c, as was to be shown.


Inductive step. Fix n > 2 and assume the result holds for n − 1. Then we may fix
a ∅′-computable coloring c*: [ω]^{n−1} → 2 with no Δ^0_{n−1}(∅′) infinite homogeneous
set. By Lemma 8.2.1, we may fix a computable c: [ω]^n → 2 such that every infinite
homogeneous set for c is also homogeneous for c*. In particular, c has no Δ^0_{n−1}(∅′)
infinite homogeneous set. But Δ^0_{n−1}(∅′) = Δ^0_n, so c witnesses the desired result. □


The following consequence is immediate.

Corollary 8.2.3. For n ≥ 2, RCA_0 ⊬ RT^n_2.


In fact, for n ≥ 3 this result can be significantly improved. Indeed, we can show that
Ramsey’s theorem can, in fact, encode a lot of specific information.

Theorem 8.2.4 (Jockusch [168]). Fix a set A. For every n ≥ 2, there exists an
A-computable coloring c: [ω]^n → 2 such that if H is any infinite homogeneous set
for c then A^(n−2) ≤_T A ⊕ H.


Proof. The case $n = 2$ is trivial, so we may assume $n \ge 3$. We take $A = \emptyset$ for
simplicity. The full theorem easily follows by relativization.

First, we claim that there is an increasing $\emptyset^{(n-2)}$-computable function $f : \omega \to \omega$
such that if $g : \omega \to \omega$ is any function that dominates $f$ (meaning, $f(x) < g(x)$ for
almost all $x \in \omega$) then $\emptyset^{(n-2)} \le_T g$. Namely, let

\[ f(x) = \sup\{ y \in \omega : (\exists e < x)[\Phi_e^{\emptyset^{(n-3)}}(x){\downarrow} = y] \}. \]

Clearly, $f \le_T \emptyset^{(n-2)}$. Moreover, $f$ dominates every partial function computable in
$\emptyset^{(n-3)}$, and any such function computes $\emptyset^{(n-2)}$. (See Exercise 8.10.2.) Since any
function that dominates $f$ will also have this property, the claim is proved.

We proceed as follows. Define $c^* : [\omega]^2 \to 2$ by $c^*(x, y) = 1$ if and only if
$y > f(x)$. For every $x$ we have $c^*(x, y) = 1$ for almost all $y$. Hence, all infinite
homogeneous sets for $c^*$ have color 1. And since $f$ is increasing, it follows that if
$H = \{h_0 < h_1 < \cdots\}$ is homogeneous for $c^*$ then $h_{x+1} > f(h_x) \ge f(x)$ for all $x$.
Thus, the principal function of $H \setminus \{h_0\}$ dominates $f$, meaning $\emptyset^{(n-2)} \le_T H$.

Now, $c^*$ is computable from $f$ and hence from $\emptyset^{(n-2)}$. If we iterate Lemma 8.2.1
$n - 2$ times, we obtain a computable $c : [\omega]^n \to 2$ such that every infinite
homogeneous set $H$ for $c$ is homogeneous for $c^*$. By construction, this is the desired
coloring. □
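The domination step in the proof above can be illustrated with a small sketch. The function `f` below is a computable toy stand-in chosen purely for illustration (the $f$ of the proof is only $\emptyset^{(n-2)}$-computable), and the names `c_star` and `greedy_homogeneous` are hypothetical. The sketch only checks, on a finite initial segment, that a set homogeneous for $c^*$ has a principal function exceeding $f$.

```python
from itertools import combinations


def f(x: int) -> int:
    """Toy increasing stand-in (an assumption); the f of the proof is not computable."""
    return x * x + 1


def c_star(x: int, y: int) -> int:
    """c*(x, y) = 1 if and only if y > f(x), for x < y."""
    assert x < y
    return 1 if y > f(x) else 0


def greedy_homogeneous(length: int, start: int = 0) -> list:
    """Return h_0 < h_1 < ... with h_{x+1} = f(h_x) + 1, so h_{x+1} > f(h_x)."""
    h = [start]
    while len(h) < length:
        h.append(f(h[-1]) + 1)
    return h


H = greedy_homogeneous(5)
# Every pair from H receives color 1, so H is a finite piece of a homogeneous set.
homogeneous = all(c_star(x, y) == 1 for x, y in combinations(H, 2))
# The principal function of H minus its least element is x -> H[x + 1].
dominates = all(H[x + 1] > f(x) for x in range(len(H) - 1))
```

Because `f` is increasing and `H[x] >= x`, the inequality `H[x + 1] > f(H[x]) >= f(x)` is exactly the chain used in the proof.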


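In the parlance of Section 3.6, we get the following consequence.

Corollary 8.2.5. For each $n \ge 3$, $\mathsf{RT}^n_2$ codes the jump (i.e., $\mathsf{TJ} \le_c \mathsf{RT}^n_2$).

Proof. Relativize the proof of Theorem 8.2.4. □

Corollary 8.2.6. For all $n \ge 3$ and $k \ge 2$, $\mathsf{RCA}_0 \vdash \mathsf{ACA}_0 \leftrightarrow \mathsf{RT}^n_k \leftrightarrow \mathsf{RT}^n$.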


Whether Corollary 8.2.5 (and Corollary 8.2.6) also holds for $n = 2$ was left as an
open question by Jockusch in [168]. Surprisingly, the answer turns out to be no, as
we discuss in the next section. Before that, one final remark is that using the results of
this section we can now prove Proposition 4.6.11. Recall that this said the following:
for each $n \ge 3$, we have $\mathsf{RT} \equiv_\omega \mathsf{RT}^n_2$, but $\mathsf{RT}$ neither admits computable solutions
nor is it computably reducible to $m$ applications of $\mathsf{RT}^n_2$ for any $m$.


Proof (of Proposition 4.6.11). Fix $n \ge 3$. First, we show $\mathsf{RT} \le_\omega \mathsf{RT}^n_2$. Fix an
$\omega$-model $\mathcal{S}$ of $\mathsf{RT}^n_2$. Fix any $X \in \mathcal{S}$. Then for each $m \ge 1$, by taking $A = X^{(m-1)}$ in
Theorem 8.2.4, it follows by induction that $X^{(m)} \in \mathcal{S}$. Now consider an arbitrary
instance $c$ of $\mathsf{RT}$ in $\mathcal{S}$. This is a coloring $c : [\omega]^m \to k$ for some $m, k \ge 1$. By
Theorem 8.1.1, $c$ has a $c^{(m)}$-computable infinite homogeneous set, and so it has a
solution in $\mathcal{S}$. We conclude $\mathcal{S}$ is a model of $\mathsf{RT}$, as was to be shown.

Next, fix $m$ and suppose $\mathsf{RT}$ is computably reducible to $m$ applications of $\mathsf{RT}^n_2$. For
each instance $d$ of $\mathsf{RT}^n_2$, fix a $d^{(n)}$-computable solution, $H_d$. Using Theorem 8.2.2, fix
a computable instance $c : [\omega]^{mn+1} \to 2$ of $\mathsf{RT}$ with no $\emptyset^{(mn)}$-computable solution.
Now there must be a sequence of instances $d_0, \ldots, d_{m-1}$ of $\mathsf{RT}^n_2$ such that $d_0 \le_T c$
and for $i > 0$, $d_i \le_T c \oplus H_{d_0} \oplus \cdots \oplus H_{d_{i-1}}$, and such that $c \oplus H_{d_0} \oplus \cdots \oplus H_{d_{m-1}}$
computes an $\mathsf{RT}$-solution to $c$. But $c \oplus H_{d_0} \oplus \cdots \oplus H_{d_{m-1}} \le_T \emptyset^{(mn)}$, and $\emptyset^{(mn)}$
computes no solution to $c$, so we have reached a contradiction. □


8.3 Seetapun’s theorem 


Two decades after Jockusch blew open the investigation into the effective content
of Ramsey’s theorem, the main outstanding question concerning the computability
theoretic strength of $\mathsf{RT}^2$ was finally answered. The solution, showing that $\mathsf{RT}^2$ does
not code the jump, was obtained in the early 1990s by Seetapun, while still a graduate
student at UC Berkeley. His result has spurred on much of the work in computable
combinatorics over the past thirty years.


Theorem 8.3.1 (Seetapun; see [275]). $\mathsf{RT}^2$ admits cone avoidance.

Corollary 8.3.2. Over $\mathsf{RCA}_0$, $\mathsf{RT}^2$ does not imply $\mathsf{ACA}_0$.


In this section, we present Seetapun’s original argument, which is in many ways 
more direct but more combinatorially involved. In Section 8.5, we will give a second, 
more computability theoretic proof, as an application of the Cholak—Jockusch— 
Slaman decomposition. 

Before proceeding, we make one general remark that will make the rest of our 
discussion simpler and more natural. 


Remark 8.3.3 (Colorings on general domains). Although we formulated Ramsey’s
theorem in Definition 3.2.5 in terms of colorings on $\omega$, we can also consider the more
general version, where an instance is a pair $(X, c)$ for an infinite set $X$ and coloring
$c : [X]^n \to k$, and a solution is an infinite homogeneous set for $c$ contained in $X$.
For definiteness, let us denote this problem, and its associated $\forall\exists$ theorem form, by
General-$\mathsf{RT}^n_k$. Then the following proposition justifies our moving freely between it
and the original version in most situations.


Proposition 8.3.4.

1. For all $n, k \ge 1$, $\mathsf{RT}^n_k \equiv_W$ General-$\mathsf{RT}^n_k$.
2. $\mathsf{RCA}_0 \vdash (\forall n)(\forall k)[\mathsf{RT}^n_k \leftrightarrow \text{General-}\mathsf{RT}^n_k]$.


Proof. We prove (1), and leave (2) to the reader. Evidently, $\mathsf{RT}^n_k$ is a subproblem
of General-$\mathsf{RT}^n_k$, so trivially $\mathsf{RT}^n_k \le_W$ General-$\mathsf{RT}^n_k$. In the other direction, fix an
instance $(X, c)$ of General-$\mathsf{RT}^n_k$. Let $p : \omega \to X$ be the principal function of $X$, and
define a coloring $\hat{c} : [\omega]^n \to k$ as follows: for $x < y$, let $\hat{c}(x, y) = c(p(x), p(y))$.
Note that $\hat{c}$ is uniformly computable from $(X, c)$. Now suppose $\hat{H}$ is any infinite
homogeneous set for $\hat{c}$. Then $H = \{p(x) : x \in \hat{H}\}$ is an infinite subset of $X$,
uniformly computable from $X \oplus \hat{H}$. It is easy to see that $H$ is homogeneous for $c$. □
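The translation in the proof of part (1) can be sketched for pairs as follows. This is a minimal illustration under assumptions: the set $X$ is given by a membership predicate `in_X`, and the helper names `principal`, `pull_back`, and `push_forward` are hypothetical, not taken from the text.

```python
from itertools import combinations


def principal(in_X, n):
    """Return p(n), the n-th element (counting from 0) of the infinite set X."""
    x = -1
    for _ in range(n + 1):
        x += 1
        while not in_X(x):
            x += 1
    return x


def pull_back(c, in_X):
    """Return the coloring c_hat on pairs of naturals: c_hat(x, y) = c(p(x), p(y))."""
    return lambda x, y: c(principal(in_X, x), principal(in_X, y))


def push_forward(H_hat, in_X):
    """Map a set homogeneous for c_hat to the subset {p(x) : x in H_hat} of X."""
    return [principal(in_X, x) for x in H_hat]


# Toy instance (an assumption): X is the set of multiples of 3.
in_X = lambda x: x % 3 == 0
c = lambda a, b: (a // 3 + b // 3) % 2   # a coloring of pairs from X
c_hat = pull_back(c, in_X)

H_hat = [0, 2, 4, 6]                      # homogeneous for c_hat with color 0
H = push_forward(H_hat, in_X)             # the corresponding subset of X
```

Both directions use only the principal function of $X$, which is why the reduction is uniform in $(X, c)$.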
Here, and in other constructions of homogeneous sets for colorings, we will use
the following elaboration of Mathias forcing.


Definition 8.3.5. For each $k \ge 1$, $k$-fold Mathias forcing is the following notion of
forcing.


1. The conditions are tuples $(E_0, \ldots, E_{k-1}, I)$ such that for each $i < k$:

   • $E_i$ is a finite set,
   • $I$ is an infinite set,
   • $E_i < I$.

2. Extension is defined by $(E^*_0, \ldots, E^*_{k-1}, I^*) \le (E_0, \ldots, E_{k-1}, I)$ if

   • for each $i < k$, $E_i \subseteq E^*_i \subseteq E_i \cup I$,
   • $I^* \subseteq I$.

3. The valuation of $(E_0, \ldots, E_{k-1}, I)$ is the string $\sigma \in 2^{<\omega}$ of length $k \cdot \min I$,
   where, for each $i < k$ and $x < \min I$, $\sigma(kx + i) = 1$ if and only if $x \in E_i$.


So, if $(E_0, \ldots, E_{k-1}, I)$ is a condition then for each $i$, $(E_i, I)$ is a Mathias condition,
and indeed, so is $(E_0 \oplus \cdots \oplus E_{k-1}, I)$. A generic filter here thus determines an object
of the form $G = H_0 \oplus \cdots \oplus H_{k-1}$.
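As a small illustration of clause (3), the valuation of a condition can be computed directly from the finite sets and $\min I$. This is only a sketch; the function names `valuation` and `component` are hypothetical, and the reservoir is represented solely by its least element.

```python
def valuation(E, min_I):
    """Return sigma of length k * min_I with sigma[k*x + i] = 1 iff x is in E[i]."""
    k = len(E)
    sigma = [0] * (k * min_I)
    for i, E_i in enumerate(E):
        for x in E_i:
            assert x < min_I, "each E_i must lie below the reservoir"
            sigma[k * x + i] = 1
    return sigma


def component(sigma, k, i):
    """Recover the i-th finite set from the valuation: {x : sigma[k*x + i] = 1}."""
    return {x for x in range(len(sigma) // k) if sigma[k * x + i] == 1}


# A 2-fold condition with E_0 = {0, 2}, E_1 = {1}, and min I = 3.
sigma = valuation([{0, 2}, {1}], 3)
```

Reading off every $k$-th bit starting at position $i$ recovers $E_i$, which mirrors how the generic object splits as $H_0 \oplus \cdots \oplus H_{k-1}$.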

We can restrict this forcing in various ways as with ordinary Mathias forcing,
e.g., by asking for the reservoirs $I$ to be computable or low or cone avoiding. For
convenience, we add the symbol $H_i$ to our forcing language to refer to the component
$H_i$ of the generic object. Note that this does not actually alter the forcing language.
We are merely defining $x \in H_i$ to be an abbreviation for $x \in G \wedge (\exists d \le x)[x = kd + i]$.


Remark 8.3.6. The way this forcing is used in proofs of Ramsey’s theorem is to add,
e.g., the following clauses to the definition of a condition $(E_0, \ldots, E_{k-1}, I)$ for a
given coloring $c : [\omega]^2 \to k$.


• $c(x, y) = i$ for all $(x, y) \in [E_i]^2$.
• $c(x, y) = i$ for all $x \in E_i$ and $y \in I$.


In this case, the generic object $G = H_0 \oplus \cdots \oplus H_{k-1}$ satisfies that each $H_i$ is
homogeneous for $c$ with color $i$. (However, $H_i$ may not be infinite.)


An important fact used in the proof of Seetapun’s theorem, and as we will see, 
other computability theoretic constructions, is the following. In some sense it is just 
a logical observation, but its utility is such that we present it as a lemma in its own 
right. 


Lemma 8.3.7 (Lachlan’s disjunction). Fix $k \ge 1$, let $A_0, \ldots, A_{k-1}$ be sets, and let
$\{R_e : e \in \omega\}$ be a countable family of requirements. If, for all tuples $e_0, \ldots, e_{k-1} \in \omega$
there is an $i < k$ such that $A_i$ satisfies $R_{e_i}$, then there is an $i < k$ such that $A_i$ satisfies
$R_e$ for all $e \in \omega$.

By considering the contrapositive, the proof of the above fact is trivial. But the
result is very useful indeed. A typical use is the construction of a homogeneous set
for a $k$-coloring satisfying some collection of requirements as above. In this case,
a common strategy is to build, for each $i < k$, a homogeneous set $H_i$ with color
$i$. We can almost never ensure that each $H_i$ satisfies all the requirements, but by
playing the different homogeneous sets off against each other, we can often show
that for each tuple $e_0, \ldots, e_{k-1}$, there is at least one $i$ such that $H_i$ satisfies the $e_i$th
requirement. If one thinks of this construction as happening dynamically, where each
$H_i$ is built up either by stages, or in the case of a forcing construction, condition by
condition, then it is usually not possible to know such an $i$ “in advance”. Lachlan’s
disjunction ensures that we do not need to. Instead, the construction can complete
and the appropriate $i$ can be found “at the end”.
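The contrapositive argument behind Lachlan’s disjunction can be made concrete in a finite setting. The sketch below assumes finitely many requirements, recorded in a hypothetical table `sat` where `sat[i][e]` says whether $A_i$ satisfies $R_e$. It returns either an index $i$ whose set meets every requirement, or a tuple $(e_0, \ldots, e_{k-1})$ witnessing that the hypothesis of the lemma fails.

```python
from itertools import product


def lachlan_witness(sat):
    """Return ('set', i) if row i is all True; otherwise ('tuple', (e_0, ..., e_{k-1}))
    where A_i fails R_{e_i} for every i, so the lemma's hypothesis fails at that tuple."""
    failing = []
    for i, row in enumerate(sat):
        if all(row):
            return ("set", i)
        failing.append(row.index(False))  # some requirement that A_i fails
    return ("tuple", tuple(failing))


def hypothesis_holds(sat):
    """Brute-force check: for every tuple, some i has A_i satisfying R_{e_i}."""
    k, n = len(sat), len(sat[0])
    return all(any(sat[i][e[i]] for i in range(k))
               for e in product(range(n), repeat=k))


good = [[True, False, True], [True, True, True]]
bad = [[True, False], [False, True]]
```

The function never needs to know the right index in advance: it inspects the finished table, which matches the remark that the appropriate $i$ is found only at the end.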

Let us move on to proving Seetapun’s theorem, where we will see a specific 
illustration of using Lachlan’s disjunction. As mentioned, this uses a very clever 
combinatorial set-up. 


Proof (of Theorem 8.3.1). By Corollary 4.5.10 and Theorem 4.6.9, we have that
$\mathsf{RT}^2 \le_\omega \mathsf{RT}^2_2$. Hence, it suffices to prove the result for $\mathsf{RT}^2_2$. As usual, we deal with
computable instances, with the general result following by relativization. So fix
$C \nleq_T \emptyset$ and let $c : [\omega]^2 \to 2$ be a computable coloring. Seeking a contradiction,
suppose $c$ has no infinite homogeneous set $H$ such that $C \nleq_T H$.

We force with 2-fold Mathias conditions with the additional clauses mentioned
in Remark 8.3.6 and with $C$-cone avoiding reservoirs. (In the general case, where $c$
is not necessarily computable, we use reservoirs $I$ with $C \nleq_T c \oplus I$.) Our assumption
about the homogeneous sets of $c$ implies that if $H_0 \oplus H_1$ is a sufficiently generic
object for this forcing then $H_0$ and $H_1$ are both infinite. Indeed, fix any condition
$(E_0, E_1, I)$. Fix $i < 2$. If there were no $x \in I$ such that $c(x, y) = i$ for infinitely many
$y \in I$, then $I$ would have an infinite $I$-computable (hence $C$-cone avoiding) subset
that is homogeneous for $c$ with color $1 - i$. Since this is impossible, there must indeed
be such an $x$. Setting $E^*_i = E_i \cup \{x\}$, $E^*_{1-i} = E_{1-i}$, and $I^* = \{y \in I : y > x\}$ yields
an extension $(E^*_0, E^*_1, I^*)$ of $(E_0, E_1, I)$ with $|E^*_i| > |E_i|$. Since $i$ was arbitrary, it
follows that for each $n$, the set of conditions $(E_0, E_1, I)$ with $|E_i| \ge n$ for each $i$ is
dense.

It thus suffices to show that for a sufficiently generic object $H_0 \oplus H_1$ there is an
$i < 2$ such that $H_i$ satisfies the following requirements for every $e \in \omega$:

\[ R_e : (\exists w)\neg[\Phi_e^{H_i}(w){\downarrow} = C(w)]. \]

As discussed above, we will not try to ascertain this $i$ right away. Instead, we will
satisfy the following modified requirements for all $e_0, e_1 \in \omega$:

\[ R_{e_0, e_1} : (\exists i < 2)(\exists w)\neg[\Phi_{e_i}^{H_i}(w){\downarrow} = C(w)]. \]

From here, the fact that either $H_0$ or $H_1$ satisfies all the requirements $R_e$ follows by
Lachlan’s disjunction.

So fix $e_0, e_1 \in \omega$ along with a condition $(E_0, E_1, I)$. We claim that there is an
extension $(E^*_0, E^*_1, I^*)$ forcing that there is an $i < 2$ such that

\[ (\exists w)\neg[\Phi_{e_i}^{H_i}(w){\downarrow} = C(w)]. \tag{8.1} \]

Since $(E_0, E_1, I)$ is arbitrary, it follows that the set of conditions forcing (8.1) for
some $i < 2$ is dense. By genericity, this implies that every sufficiently generic $H_0 \oplus H_1$
satisfies $R_{e_0, e_1}$, as desired.

To prove the claim, we first need a series of definitions. For each $i < 2$, a nonempty
finite set $F$ is an $i$-blob if the following hold.


• $F \subseteq I$ (so $E_i < F$).
• $c(x, y) = i$ for all $x, y \in F$.
• There is a $w > E_i$ and for each $j < 2$ a subset $F^j \subseteq F$ such that $\Phi_{e_i}^{E_i \cup F^j}(w){\downarrow} = j$,
  with use bounded by $\max F^j$.


Thus, in particular, the sets $F^0$ and $F^1$ satisfy $\Phi_{e_i}^{E_i \cup F^0}(w){\downarrow} \ne \Phi_{e_i}^{E_i \cup F^1}(w){\downarrow}$, and so
one of these two computations differs from $C(w)$.

A Seetapun sequence is an infinite sequence $F_0, F_1, \ldots$ of 0-blobs with $F_s < F_{s+1}$
for all $s$. The Seetapun tree determined by such a sequence is the set $T$ of all $\alpha \in \omega^{<\omega}$
satisfying the following.


• $\alpha(s) \in F_s$ for all $s < |\alpha|$.
• There is no 1-blob $F \subseteq \operatorname{range}(\alpha^{\#})$.


The Seetapun tree is indeed a tree, and it is finitely branching. Figure 8.1 provides a
visual. The clause that the 1-blob $F$ be contained in $\operatorname{range}(\alpha^{\#})$, rather than $\operatorname{range}(\alpha)$,
is for convenience. It means that every terminal $\alpha \in T$ has a 1-blob in its range.

We proceed to the construction of the extension $(E^*_0, E^*_1, I^*)$. There are two cases
to consider.


Case 1: For some $i < 2$, there exists an infinite $C$-cone avoiding subset of $I$ containing
no $i$-blob. Let the subset of $I$ in question be $I^*$, and let $E^*_0 = E_0$ and $E^*_1 = E_1$.
Clearly, $(E^*_0, E^*_1, I^*)$ extends $(E_0, E_1, I)$, and we claim that it forces (8.1). Consider
any $w > E^*_i = E_i$. By assumption, there do not exist $F^0, F^1 \subseteq I^*$ homogeneous for
$c$ with color $i$ such that $\Phi_{e_i}^{E_i \cup F^j}(w){\downarrow} = j$ for each $j < 2$, else the union of these sets
would be an $i$-blob. Hence, if $\Phi_{e_i}^{E_i \cup F}(w){\downarrow} < 2$ for some $F \subseteq I^*$ homogeneous for
$c$ with color $i$, then the value of the computation depends only on $w$, and not on $F$.
Denote this value by $v_w$. Thus, $v_w$ is defined if and only if some $F \subseteq I^*$ as above
exists. Note that we can search for such an $F$ computably in $I^*$, meaning that $v_w$, if
defined, can be found uniformly $I^*$-computably from $w$. It follows that if $v_w$ is defined
for every $w$, then the sequence $\langle v_w : w \in \omega \rangle$ is $I^*$-computable and hence does not
compute $C$. This leads to the conclusion that there must be a $w$ such that either $v_w$
is undefined or $v_w \ne C(w)$. But if some extension $(\widetilde{E}_0, \widetilde{E}_1, \widetilde{I})$ of $(E^*_0, E^*_1, I^*)$ forced
$\Phi_{e_i}^{H_i}(w){\downarrow} = C(w)$, then $\widetilde{E}_i \setminus E_i$ would be precisely an $F \subseteq I^*$ witnessing that $v_w$ is
defined and equal to $C(w)$. Hence, $(E^*_0, E^*_1, I^*)$ must force $\neg(\Phi_{e_i}^{H_i}(w){\downarrow} = C(w))$, as
wanted.

Case 2: Otherwise. We first use the failure of Case 1 to construct a Seetapun sequence
$F_0, F_1, \ldots$. Let $F_0$ be any 0-blob contained in $I$, which exists by hypothesis, and
suppose that we have defined $F_s$ for some $s \in \omega$. Let $F_{s+1}$ be any 0-blob contained in
$\{y \in I : y > F_s\}$, which again exists by hypothesis. Clearly, this Seetapun sequence
is computable in $I$.

It follows that the Seetapun tree $T$ corresponding to this sequence is also
$I$-computable. As $T$ is finitely branching, if it is infinite we may fix a $C$-cone
avoiding path $P$ through it, using Theorem 2.8.23. Now $F_s < F_{s+1}$ for all $s$, so
also $P(s) < P(s + 1)$. Hence, $\operatorname{range}(P)$ is computable from $P$, and therefore is in
particular $C$-cone avoiding. But by definition of the Seetapun tree, $\operatorname{range}(P)$ contains
no 1-blob, which contradicts Case 1 not holding.

We conclude that $T$ is finite. There is thus an $m \in \omega$ such that every $\alpha \in T$
has length at most $m$. Thus, the only elements of our Seetapun sequence needed to
construct $T$ are the $F_s$ for $s < m$. By considering the elements of $\bigcup_{s<m} F_s$ in turn, we
can thin out $I$ to an $I$-computable infinite set $\widetilde{I} \subseteq I$ with the property that for every
$x \in \bigcup_{s<m} F_s$ there is a color $i_x < 2$ such that $c(x, y) = i_x$ for every $y \in \widetilde{I}$. More
specifically, enumerate the elements of $\bigcup_{s<m} F_s$ as $x_0, \ldots, x_{v-1}$. Define $I_0 = I$, and
suppose inductively that we have defined $I_u$ for some $u < v$. Let $i_{x_u}$ be the least $i < 2$
such that there are infinitely many $y \in I_u$ with $c(x_u, y) = i$. Then let $I_{u+1}$ be the set
of all such $y$. Finally, take $\widetilde{I} = I_v$.

Notice that for any finite set $F \subseteq \bigcup_{s<m} F_s$, if $F$ is homogeneous for $c$ with
color $i$ and every $x \in F$ has $i_x = i$, then setting $\widehat{E}_i = E_i \cup F$, $\widehat{E}_{1-i} = E_{1-i}$, and
$\widehat{I} = \{y \in \widetilde{I} : y > F\}$ produces an extension $(\widehat{E}_0, \widehat{E}_1, \widehat{I})$ of $(E_0, E_1, I)$.

We now consider two subcases.


Subcase 2a: There exists $s < m$ such that $i_x = 0$ for all $x \in F_s$. Fix such an $s$.
By definition, there is a $j < 2$ such that $\Phi_{e_0}^{E_0 \cup F_s^j}(w){\downarrow} \ne C(w)$, with the use of
the computation bounded by $\max F_s^j$. Let $E^*_0 = E_0 \cup F_s^j$, let $E^*_1 = E_1$, and let
$I^* = \{y \in \widetilde{I} : y > F_s^j\}$. Then $(E^*_0, E^*_1, I^*)$ is the desired extension of $(E_0, E_1, I)$.


Subcase 2b: Otherwise. For each $s < m$, fix $x_s \in F_s$ with $i_{x_s} = 1$. Let $\alpha$ be the string
of length $m$ with $\alpha(s) = x_s$ for each $s < m$. Then either $\alpha \notin T$ or $\alpha$ is terminal in $T$.
Either way, $\operatorname{range}(\alpha)$ contains some 1-blob, $F$. By definition, there is a $j < 2$ such
that $\Phi_{e_1}^{E_1 \cup F^j}(w){\downarrow} \ne C(w)$, with the use of the computation bounded by $\max F^j$. Let
$E^*_0 = E_0$, let $E^*_1 = E_1 \cup F^j$, and let $I^* = \{y \in \widetilde{I} : y > F^j\}$. Then $(E^*_0, E^*_1, I^*)$ is the
desired extension of $(E_0, E_1, I)$. □


8.4 Stability and cohesiveness 


The difference between $\mathsf{RT}^2$ and $\mathsf{RT}^n$ for $n \ge 3$ exposed by Seetapun’s theorem
created an incentive to understand Ramsey’s theorem for pairs more deeply and in
new ways. One such approach that proved extremely fruitful was invented in the
seminal work of Cholak, Jockusch, and Slaman [33]. The idea is to decompose
$\mathsf{RT}^2$ into two “simpler” principles, each of which is some sort of elaboration on
Ramsey’s theorem for singletons ($\mathsf{RT}^1$), which is a much simpler theorem to analyze.
In this section we state these two principles, and in the next section we prove the
Cholak-Jockusch-Slaman decomposition and discuss its significance.


Figure 8.1. An illustration of a Seetapun tree. The bottom node is the root, $\langle\rangle$. Each horizontal
rectangle above this is a 0-blob, with its elements represented by the dots inside. The vertical
arrangement of 0-blobs represents a Seetapun sequence. Highlighted are five different nodes in the
Seetapun tree, represented as paths via doubled lines. The nodes agree on their first four values. The
dashed outline represents a 1-blob in the (common portion of the) ranges of these nodes. Hence,
these nodes are maximal in the tree (i.e., they have length 5 and have no extensions in the tree of
length 6).


8.4.1 Stability 


The first way to “simplify” Ramsey’s theorem for pairs is to restrict to so-called
stable colorings.


Definition 8.4.1. Fix $k \ge 1$, an infinite set $X$, and a coloring $c : [X]^2 \to k$.


1. For $x \in X$, an infinite set $Y \subseteq X$, and $i < k$, we write $\lim_{y \in Y} c(x, y) = i$ if
   $c(x, y) = i$ for almost all (i.e., all but finitely many) $y \in Y$.

2. A set $L \subseteq X$ is limit homogeneous for $c$ if there is an $i < k$ such that
   $\lim_{y \in L} c(x, y) = i$ for all $x \in L$.

3. $c$ is stable if for each $x \in X$ there is an $i < k$, called the limit color of $x$ under $c$,
   such that $\lim_{y \in X} c(x, y) = i$.

4. If $c$ is stable, the coloring $d : X \to k$ defined by $d(x) = \lim_{y \in X} c(x, y)$ is called
   the coloring of singletons induced by $c$.


By analogy with homogeneous sets, we may also say the limit homogeneous set $L$
in the above definition is limit homogeneous with color $i$. Note that if $c$ is stable and
$L \subseteq X$ is an infinite limit homogeneous set for $c$ with color $i$ then we not only have
$\lim_{y \in L} c(x, y) = i$ for all $x \in L$ but also $\lim_{y \in X} c(x, y) = i$. When $X = \omega$ this agrees
with Definition 2.6.4, so we typically just write $\lim_y$ in place of $\lim_{y \in \omega}$.

It is easy to see that every infinite homogeneous set is limit homogeneous. The
converse is false. For example, consider $c : [\omega]^2 \to 2$ defined by $c(x, x + 1) = 0$
and $c(x, y) = 1$ for all $y > x + 1$. Then $\lim_y c(x, y) = 1$ for all $x$, so $\omega$ is limit
homogeneous for $c$. But of course, $\omega$ is not homogeneous for $c$. To amplify on this,
homogeneity cares about the “local” behavior of the coloring (i.e., what color is
assigned to each particular pair of elements) and as such is still a property of pairs,
even for stable colorings. Limit homogeneity only cares about the “global” behavior
(i.e., the limit colors), and as such is really a property of singletons. (We will say
more about this shortly.) However, every limit homogeneous set can be “thinned”
out to a homogeneous one, and effectively so.


Proposition 8.4.2. Fix $k \ge 1$ and a coloring $c : [\omega]^2 \to k$. If $L$ is an infinite limit
homogeneous set for $c$ then $c$ has a $(c \oplus L)$-computable infinite homogeneous set
$H \subseteq L$.


Proof. Enumerate the elements of $L$ as $x_0 < x_1 < \cdots$. Fix $i < k$ such that $L$ is limit
homogeneous with color $i$. Let $n_0 = 0$, and suppose that for some $s \ge 0$ we have
defined numbers $n_0 < \cdots < n_s$. Since $\lim_y c(x, y) = i$ for all $x \in L$, there exists a
number $n > n_s$ such that $c(x_{n_t}, x_n) = i$ for all $t \le s$. Let $n_{s+1}$ be the least such $n$. Then
by induction, for all $t < s$ we have that $c(x_{n_t}, x_{n_s}) = i$. Thus, $H = \{x_{n_s} : s \in \omega\}$
is an infinite homogeneous subset of $L$, and clearly $H$ is computable from $c \oplus L$. □
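The thinning procedure from the proof can be sketched as a greedy search. The code below is a minimal illustration under assumptions: $L$ is supplied as an increasing iterator, the limit color $i$ is passed in explicitly as `color`, and the name `thin` is hypothetical. It is run on the example coloring given earlier in this section, with $c(x, x+1) = 0$ and $c(x, y) = 1$ otherwise.

```python
from itertools import combinations, count


def c(x, y):
    """Example coloring: c(x, x + 1) = 0 and c(x, y) = 1 for all y > x + 1."""
    assert x < y
    return 0 if y == x + 1 else 1


def thin(coloring, L, color, size):
    """Greedily pick elements of L that get `color` with everything chosen so far.

    L is an increasing iterable whose elements all have limit color `color`;
    the search for the next element then always terminates.
    """
    H = []
    for x in L:
        if all(coloring(h, x) == color for h in H):
            H.append(x)
            if len(H) == size:
                break
    return H


H = thin(c, count(0), 1, 5)
```

Note that the limit color is an input to `thin` rather than something the procedure discovers; the sketch makes no attempt to compute it from $c$ and $L$.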


Remark 8.4.3 (Complexity of determining limit colors). Given a coloring $c : [\omega]^2 \to k$,
determining whether an element $x$ has a limit color is uniformly $\Sigma^0_2(c)$ in $x$, since

\[ \lim_y c(x, y) \text{ exists} \leftrightarrow (\exists i < k)(\exists z > x)(\forall y \ge z)[c(x, y) = i]. \]

But if we know that $x$ has a limit color, then determining its value is uniformly $\Delta^0_2(c)$
in $x$, as

\[ \lim_y c(x, y) = i \leftrightarrow (\exists z > x)(\forall y \ge z)[c(x, y) = i] \leftrightarrow (\forall z > x)(\exists y \ge z)[c(x, y) = i]. \]


Proposition 8.4.4. Fix $k \ge 1$ and a stable coloring $c : [\omega]^2 \to k$. Let $d$ be the
coloring of singletons induced by $c$. Then $d$ is uniformly $c'$-computable, and its
infinite homogeneous sets are precisely the limit homogeneous sets of $c$.


We leave the proof of the proposition to the reader. Building homogeneous sets for 
colorings of singletons is much easier than for colorings of pairs. Thus, Proposi- 
tions 8.4.2 and 8.4.4 can be thought of as saying that for stable colorings of pairs, 
we can trade the combinatorial complexity of building homogeneous sets for the 
computational complexity of finding limit colors. 

Using the limit lemma, we can obtain a kind of converse to the preceding
proposition.

Proposition 8.4.5. Fix $k \ge 1$, a set $A$, and an $A'$-computable coloring $d : \omega \to k$.
Then there exists a uniformly $A$-computable stable coloring $c : [\omega]^2 \to k$ so that $d$
is the coloring of singletons induced by $c$.


Proof. By the limit lemma relativized to $A$, we may fix an $A$-computable
approximation to $d$. This is a function $\hat{d} : \omega^2 \to \omega$ such that for every $x$ and
every sufficiently large $y$, we have $\hat{d}(x, y) = d(x)$. Define $c : [\omega]^2 \to k$ as
follows: for $x < y$, let $c(x, y) = \hat{d}(x, y)$ if $\hat{d}(x, y) < k$, and otherwise let
$c(x, y) = 0$. By the uniformity of the limit lemma, $c$ is uniformly $A$-computable.
Also, $\lim_y c(x, y) = \lim_y \hat{d}(x, y) = d(x)$, as desired. □
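The construction in this proof amounts to clamping a limit approximation into the range of colors. The following sketch uses an artificial approximation `d_hat` invented for illustration (it returns an out-of-range guess at early stages and the true value afterwards); the target coloring `d` and the constant `K` are likewise assumptions, not objects from the text.

```python
import math

K = 2  # number of colors


def d(x):
    """Toy target coloring of singletons: 1 on perfect squares, 0 elsewhere."""
    return 1 if math.isqrt(x) ** 2 == x else 0


def d_hat(x, y):
    """Toy limit approximation: an out-of-range guess early, then the true value."""
    if y < x + 3:
        return 7
    return d(x)


def c(x, y):
    """Stable coloring of pairs: use the stage-y guess if it is a color, else 0."""
    assert x < y
    guess = d_hat(x, y)
    return guess if guess < K else 0
```

For each fixed `x`, the values `c(x, y)` agree with `d(x)` once `y >= x + 3`, so `d` is the coloring of singletons induced by `c`. The clamp to 0 only affects finitely many `y` for each `x` and therefore does not change any limit.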


From the computational point of view, we conclude that finding limit homogeneous
sets for $A$-computable stable colorings of pairs is the same as finding homogeneous
sets for $A'$-computable colorings of singletons. Frequently, it is convenient to think
of the latter as an actual $\Delta^0_2(A)$ finite partition $\langle P_0, \ldots, P_{k-1} \rangle$ of $\omega$. On that view,
we are just looking at infinite subsets of the $P_i$. In particular, if $k = 2$, finding an
infinite limit homogeneous set of a given $A$-computable stable coloring is the same
as finding an infinite subset of a given $\Delta^0_2(A)$ set or of its complement.
We can formulate the discussion in terms of $\forall\exists$ theorems and problems.


Definition 8.4.6 (Stable Ramsey’s theorem).


1. For $k \ge 1$, the stable Ramsey’s theorem for $k$-colorings ($\mathsf{SRT}^2_k$) is the following
   statement: every stable coloring $c : [\omega]^2 \to k$ has an infinite homogeneous set.

2. For $k \ge 1$, the $\Delta^0_2$ $k$-partition subset principle ($\mathsf{D}^2_k$) is the following statement:
   every stable coloring $c : [\omega]^2 \to k$ has an infinite limit homogeneous set.


$\mathsf{SRT}^2$ and $\mathsf{D}^2$ are defined as $(\forall k)\mathsf{SRT}^2_k$ and $(\forall k)\mathsf{D}^2_k$.


Each of these has a problem form obtained from the theorem form in the standard 
way. As usual, we will shift perspectives between the forms freely depending on 
context. For example, one immediate corollary of Proposition 8.4.4 is the following 
result. 


Corollary 8.4.7. $\mathsf{SRT}^2$ admits $\Delta^0_2$ solutions.


As in Remark 8.3.3 and Proposition 8.3.4, we can apply the above principles also
to colorings defined on other infinite subsets of the natural numbers in Weihrauch
reductions and proofs over $\mathsf{RCA}_0$.

How do $\mathsf{SRT}^2_k$ and $\mathsf{D}^2_k$ compare under our various measures for studying different
problems and theorems? For starters, the following proposition is an immediate
consequence of Proposition 8.4.2.


Proposition 8.4.8. For each $k \in \omega$, $\mathsf{SRT}^2_k \equiv_c \mathsf{D}^2_k$.


By contrast, it turns out that $\mathsf{SRT}^2_2 \nleq_W \mathsf{D}^2_2$ and $\mathsf{SRT}^2_2 \nleq_{sc} \mathsf{D}^2_2$, as we show in
Theorem 9.1.18. For now, the most relevant reducibility to consider is provability under
$\mathsf{RCA}_0$. Obviously, $\mathsf{RCA}_0 \vdash (\forall k)[\mathsf{SRT}^2_k \to \mathsf{D}^2_k]$. In the other direction, it may seem
straightforward to formalize Proposition 8.4.2 in second order arithmetic. But there
is a hitch. Suppose that, in an arbitrary model $\mathcal{M}$ of $\mathsf{RCA}_0$, we have a $c$ that $\mathcal{M}$ thinks
is a stable 2-coloring of pairs. Suppose we have numbers $x_0, \ldots, x_{a-1} \in M$, each
with the same limit color $i < 2$. How do we know there is a number $x >^{\mathcal{M}} x_{a-1}$ such
that $c(x_b, x) = i$ for all $b <^{\mathcal{M}} a$? The natural justification (for each $b <^{\mathcal{M}} a$, choose
$s_b > x_b$ so that $c(x_b, y) = i$ for all $y >^{\mathcal{M}} s_b$, and then let $x = \max\{s_b : b <^{\mathcal{M}} a\}$)
actually uses $\mathsf{B}\Sigma^0_2$. Since $\mathcal{M}$ may not satisfy $\mathsf{B}\Sigma^0_2$, we need first of all to show that
this follows from $\mathsf{D}^2_2$ itself.


Theorem 8.4.9 (Chong, Lempp, and Yang [34]). $\mathsf{RCA}_0 \vdash \mathsf{D}^2_2 \to \mathsf{B}\Sigma^0_2$.


Proof. Recall the principle PART from Definition 6.5.7. By Theorem 6.5.8, this is
equivalent to $\mathsf{B}\Sigma^0_2$. It thus suffices to show that $\mathsf{D}^2_2 \to \mathsf{PART}$.

We argue in $\mathsf{RCA}_0$. Let $\le_L$ be a linear ordering of $\mathbb{N}$ such that $(\mathbb{N}, \le_L)$ is of
order type $\omega + \omega^*$. To show that it is strongly of order type $\omega + \omega^*$, let
$\{\ell = x_0 <_L \cdots <_L x_{k+1} = g\}$ be a finite set, where $\ell$ and $g$ are the least and greatest
elements under $\le_L$. And seeking a contradiction, suppose that for each $i \le k$ the set
$\{y \in \mathbb{N} : x_i <_L y <_L x_{i+1}\}$ is finite. We claim that for each $i \le k$,

\[ \{y \in \mathbb{N} : x_i <_L y\} \text{ is infinite.} \tag{8.2} \]

This is obvious for $i = 0$, and we should like to prove it for all $i$ by induction.
However, we cannot do so directly since it is a $\Pi^0_2$ statement. Before addressing this,
note that once this claim is proved we have our contradiction: since $x_{k+1} = g$ we
have that

\[ \{y \in \mathbb{N} : x_k <_L y\} = \{y \in \mathbb{N} : x_k <_L y \le_L x_{k+1}\}, \]

and the set on the right is finite by assumption.

To prove (8.2), define $c : [\mathbb{N}]^2 \to 2$ as follows: for all $x < y$, let $c(x, y) = 1$
if $x <_L y$, and let $c(x, y) = 0$ if $y <_L x$. Note that $c$ is stable. Indeed, fix $x$. By
assumption, exactly one of $\{y \in \mathbb{N} : y <_L x\}$ and $\{y \in \mathbb{N} : x <_L y\}$ is infinite. In the
former case, $c(x, y) = 0$ for almost all $y$. In the latter, $c(x, y) = 1$ for almost all $y$.

We can thus apply $\mathsf{D}^2_2$ to obtain an infinite limit homogeneous set $L$ for $c$. If $L$ has
color 0, then every element of $L$ has only finitely many elements $<_L$-above it, so for
every $x \in \mathbb{N}$ we have

\[ \{y \in \mathbb{N} : x <_L y\} \text{ is infinite} \leftrightarrow (\forall y \in L)[x <_L y], \]

while if $L$ has color 1 then similarly

\[ \{y \in \mathbb{N} : x <_L y\} \text{ is infinite} \leftrightarrow (\exists y \in L)[x <_L y]. \]

It follows in particular that (8.2) is equivalent to either a $\Sigma^0_1$ or $\Pi^0_1$ formula in $i$, and
therefore can be proved by induction. □
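The coloring used in the proof above can be examined on a concrete ordering. The sketch below fixes one toy ordering of type $\omega + \omega^*$ chosen for illustration (evens ascending, followed by odds descending); the names `less_L` and `in_omega_part` are hypothetical, and the existential quantifier is checked only over a finite window, so this illustrates the $\Sigma^0_1$ characterization rather than proving anything.

```python
def less_L(x, y):
    """x <_L y in the toy order 0 < 2 < 4 < ... < 5 < 3 < 1 (type omega + omega*)."""
    if x % 2 == 0 and y % 2 == 0:
        return x < y          # both in the omega part
    if x % 2 == 1 and y % 2 == 1:
        return x > y          # both in the omega* part
    return x % 2 == 0         # the omega part lies below the omega* part


def c(x, y):
    """c(x, y) = 1 if x <_L y, and 0 if y <_L x, for x < y."""
    assert x < y
    return 1 if less_L(x, y) else 0


def in_omega_part(x, L_window):
    """Sigma^0_1 test with a color-1 limit homogeneous set: some y in L has x <_L y."""
    return any(less_L(x, y) for y in L_window)


evens = range(0, 100, 2)  # finite window of a limit homogeneous set with color 1
```

In this ordering every even number has limit color 1 and every odd number has limit color 0, so the evens and the odds are limit homogeneous sets of the two possible colors.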


Remark 8.4.10. By Theorem 6.5.1, $\mathsf{B}\Sigma^0_2$ is equivalent over $\mathsf{RCA}_0$ to $\mathsf{RT}^1$, and it is
easy to see that $\mathsf{RT}^1$ is a consequence of $\mathsf{D}^2$. Thus the emphasis in the above theorem
is on the fact that we can obtain $\mathsf{B}\Sigma^0_2$ even when the number of colors in the instances
of $\mathsf{D}^2$ is fixed. It is much more straightforward to show separately that $\mathsf{SRT}^2_2$ implies
$\mathsf{RT}^1$ over $\mathsf{RCA}_0$ (Exercise 8.10.3).

Corollary 8.4.11. $\mathsf{RCA}_0 \vdash (\forall k)[\mathsf{SRT}^2_k \leftrightarrow \mathsf{D}^2_k]$.


Proof. As noted above, $\mathsf{RCA}_0 \vdash (\forall k)[\mathsf{SRT}^2_k \to \mathsf{D}^2_k]$. Obviously, $\mathsf{RCA}_0 \vdash \mathsf{D}^2_1 \to
\mathsf{SRT}^2_1$. If $k \ge 2$ then $\mathsf{D}^2_k \to \mathsf{D}^2_2$, so $\mathsf{RCA}_0 + \mathsf{D}^2_k \vdash \mathsf{B}\Sigma^0_2$ by Theorem 8.4.9.
Now the argument of Proposition 8.4.2 can be carried out. □


The main import of these results is that, in most situations, we can use $\mathsf{SRT}^2_2$
and $\mathsf{D}^2_2$ interchangeably. This is useful because, as we will see, constructing limit
homogeneous sets is easier than constructing homogeneous ones. This makes sense:
as discussed earlier, for limit homogeneous sets we do not care about “local” behavior,
so there are fewer things to ensure. In particular, working with $\mathsf{D}^2_2$ or $\mathsf{D}^2$ is usually
more convenient than directly with $\mathsf{SRT}^2_2$ or $\mathsf{SRT}^2$.


8.4.2 Cohesiveness 


We now move on to the second “simplification” of $\mathsf{RT}^2$, which is the principle COH
from Definition 4.1.6. This is far less obviously related to Ramsey’s theorem, but as
a first pass we can look at the following definition and result.


Definition 8.4.12. Let $\mathsf{P}$ be an instance solution problem all of whose solutions are
subsets of $\omega$. Then $\mathsf{P}$ with finite errors, denoted $\mathsf{P}^{\mathrm{fe}}$, is the problem whose instances
are the same as those of $\mathsf{P}$, and the $\mathsf{P}^{\mathrm{fe}}$-solutions to any such instance $X$ are all sets
$Z =^* Y$ for some $\mathsf{P}$-solution $Y$ to $X$.


Thus, for example, consider $(\mathsf{RT}^1_2)^{\mathrm{fe}}$. A solution to a given coloring $c : \omega \to 2$ is now
any infinite set which is almost homogeneous, i.e., homogeneous up to finitely many
modifications. COH, it turns out, is essentially the parallelized version (as defined
in Definition 3.1.5) of this principle, i.e., $\widehat{(\mathsf{RT}^1_2)^{\mathrm{fe}}}$. (A note of caution: for a problem
$\mathsf{P}$ as above, $\widehat{\mathsf{P}^{\mathrm{fe}}}$ and $(\widehat{\mathsf{P}})^{\mathrm{fe}}$ are not the same!) This connection is not a priori obvious,
even if it may seem to be. By considering characteristic functions, the instances of
COH and $\widehat{(\mathsf{RT}^1_2)^{\mathrm{fe}}}$ are seen to be the same. Given a sequence of colorings $\omega \to 2$, an
$\widehat{(\mathsf{RT}^1_2)^{\mathrm{fe}}}$-solution is a sequence of almost homogeneous sets, none of which needs to
have any relation to any of the others. A solution to COH, on the other hand, is a single
set that is almost homogeneous for all the colorings in the instance simultaneously.


Theorem 8.4.13 (Jockusch and Stephan [167]). Fix a set $A$. The following are
equivalent for a set $X$.


1. $X' \gg A'$.
2. $X$ computes a solution to every $A$-computable instance of COH.
3. $X$ computes a solution to every $A$-computable instance of $\widehat{(\mathsf{RT}^1_2)^{\mathrm{fe}}}$.


Proof. (1) → (2): Fix $X' \gg A'$ and an $A$-computable instance $\langle R_i : i \in \omega \rangle$ of
COH. For each $\sigma \in 2^{<\omega}$, we define $R_\sigma$ inductively as follows: if $\sigma = \langle\rangle$, we let
$R_\sigma = \omega$; if $\sigma = \tau 1$ then $R_\sigma = R_\tau \cap R_{|\tau|}$; if $\sigma = \tau 0$ then $R_\sigma = R_\tau \cap \overline{R_{|\tau|}}$. Note
that if $R_\tau$ is infinite then so is at least one of $R_{\tau 0}$ and $R_{\tau 1}$. Moreover, there is at
least one string $\sigma$ of every length such that $R_\sigma$ is infinite. Since $X' \gg A'$, it follows
by Theorem 2.8.25 that $X'$ computes a function $f : \omega \to 2^{<\omega}$ such that for each
$n$, $|f(n)| = n$ and $R_{f(n)}$ is infinite. Fix an $X$-computable limit approximation to $f$,
which we may take to be a function $\hat{f} : \omega^2 \to 2^{<\omega}$ such that $|\hat{f}(n, s)| = n$ for all $n$
and $s$, and $\hat{f}(n, s) = f(n)$ for all sufficiently large $s$. We may further assume that if
$n < m$ then $\hat{f}(m, s) \succeq \hat{f}(n, s)$ for all $s$.

We now define a sequence of numbers $x_0 < x_1 < \cdots$. For $n \in \omega$, search for
the least $s \ge n$ and the least $x > \max\{x_m : m < n\}$ such that $x \in R_{\hat{f}(n, s)}$, and
let $x_n = x$. Note that the search for $x_n$ must succeed since $R_{\hat{f}(n, s)} = R_{f(n)}$
for all sufficiently large $s$ and $R_{f(n)}$ is infinite. Let $S = \{x_n : n \in \omega\}$. Then $S$ is
$X$-computable and infinite, and we claim it is cohesive. Indeed, fix $i$ and let $n$ be such
that $R_{\hat{f}(i+1, s)} = R_{f(i+1)}$ for all $s \ge n$. Then all $x_m$ for $m \ge n$ belong to $R_{f(i+1)}$
and so either to $R_i$ (if $f(i + 1)(i) = 1$) or to $\overline{R_i}$ (if $f(i + 1)(i) = 0$). Hence, either
$S \subseteq^* R_i$ or $S \subseteq^* \overline{R_i}$, respectively.

(2) → (3): Immediate, as $\widehat{(\mathsf{RT}^1_2)^{\mathrm{fe}}}$ is a subproblem of COH.

(3) → (1): We define an $A$-computable instance $\langle R_i : i \in \omega \rangle$ of $\widehat{(\mathsf{RT}^1_2)^{\mathrm{fe}}}$ as follows.
Fix $i$. Using a fixed $A$-computable approximation to $A'$, we can approximate $\Phi_i^{A'}(i)$,
letting $\Phi_i^{A'}(i)[s]$ denote its value (if any) at stage $s$ of this approximation. For each
$s$, if $\Phi_i^{A'}(i)[s]{\downarrow} = 0$ let $R_i(s) = 1$. Otherwise, let $R_i(s) = 0$. Now suppose $S \le_T X$
is an infinite cohesive set for $\langle R_i : i \in \omega \rangle$. We define an $X'$-computable function $f$
as follows: given $i$, first use $X'$ to compute whether $S \subseteq^* R_i$ or $S \subseteq^* \overline{R_i}$, and then set
$f(i) = 1$ in the first case and $f(i) = 0$ in the second. For each $i$, if $\Phi_i^{A'}(i){\downarrow} = 0$ then
for all sufficiently large $s$ we have $\Phi_i^{A'}(i)[s]{\downarrow} = 0$, and hence $R_i(s) = 1$. In this case,
we thus have $f(i) = 1 \ne \Phi_i^{A'}(i)$. If $\Phi_i^{A'}(i){\downarrow} \ne 0$ then we instead have $R_i(s) = 0$ for
all sufficiently large $s$, hence $f(i) = 0$. We conclude that $f$ is DNC relative to $A'$,
which means that $X' \gg A'$, as wanted. □


The proof of the last implication actually exhibits an $A$-computable instance
of COH, every solution to which computes a solution to every other $A$-computable
instance. In the parlance introduced in Exercise 4.8.1, this yields:


Corollary 8.4.14. COH admits universal instances. 


This gives us a nice degree theoretic characterization of the $\omega$-models of COH.
Namely, it follows that an $\omega$-model $\mathcal{M}$ satisfies COH if and only if, for every
$A \in \mathcal{S}^{\mathcal{M}}$, there is an $X \in \mathcal{S}^{\mathcal{M}}$ with $X' \gg A'$.

Observing the uniformity in the proof of Theorem 8.4.13 also yields the following 
more direct relationship. 


Corollary 8.4.15. COH =w (ATE, 


All told, we find that COH, like $\mathsf{SRT}^2_2$ and $\mathsf{D}^2_2$, is some kind of variation on Ramsey’s
theorem for singletons. Unlike $\mathsf{SRT}^2_2$ and $\mathsf{D}^2_2$, however, it is not at all evident that $\mathsf{RT}^2_2$
implies COH (over $\mathsf{RCA}_0$, or in any other sense). In fact, it does, and in a strong way.


Theorem 8.4.16 (Cholak, Jockusch, and Slaman [33]). COH is uniformly identity
reducible to $\mathsf{RT}^2_2$.


Proof. Let $(R_i : i \in \omega)$ be an instance of COH. First, we may assume that for
all $x < y$ there is an $i$ such that $R_i(x) \neq R_i(y)$, i.e., $R_i$ contains one of $x$ and $y$
but not the other. (Indeed, we can immediately define a new family $(R^*_i : i \in \omega)$,
with $R^*_{2i} = R_i$ and $R^*_{2i+1} = \{i\}$, which has the desired property. This new family
is uniformly computable from the first, and any infinite cohesive set for it is also
cohesive for the original.) We now define a coloring $c : [\omega]^2 \to 2$ as follows: given
$x < y$, find the least $i \in \omega$ such that $R_i(x) \neq R_i(y)$, and then output $R_i(x)$. Note that
$c$ is uniformly computable from $(R_i : i \in \omega)$. Let $H$ be any $\mathsf{RT}^2_2$-solution to $c$. We
claim that $H$ is an infinite cohesive set for $(R_i : i \in \omega)$. We proceed by induction,
showing that for each $i$, either $H \subseteq^* R_i$ or $H \subseteq^* \overline{R_i}$. Fix $i$, and assume the result
is true for all $j < i$. Let $b$ be so that for each $j < i$, either every $x > b$ in $H$ is
in $R_j$ or every $x > b$ in $H$ is in $\overline{R_j}$. Now if $H$ intersects both $R_i$ and $\overline{R_i}$ infinitely,
we can fix $b < x < y < z$ in $H$ so that $R_i(x) \neq R_i(y) \neq R_i(z)$. By assumption,
$R_j(x) = R_j(y) = R_j(z)$ for every $j < i$. But then by definition of $c$, we must have
$c(x, y) \neq c(y, z)$, which is impossible since $x, y, z \in H$ and $H$ is homogeneous for
$c$. We conclude that either $H \cap R_i$ or $H \cap \overline{R_i}$ is finite, i.e., $H \subseteq^* \overline{R_i}$ or $H \subseteq^* R_i$, as
desired. □
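The reduction in this proof is explicitly computable, and a short sketch may help make it concrete. The Python fragment below is only an illustration under stated assumptions: the family is represented by a hypothetical function `R` with `R(i)` the characteristic function of $R_i$, and the names `padded`, `coloring`, and `example_family` are ours, not the text's. `padded` builds the auxiliary family $R^*$ and `coloring` searches for the least separating index.

```python
def padded(R):
    """Return the family R* with R*_{2i} = R_i and R*_{2i+1} = {i}."""
    def R_star(n):
        i, is_odd = divmod(n, 2)
        if is_odd:
            return lambda x: 1 if x == i else 0
        return R(i)
    return R_star


def coloring(R, x, y):
    """Color the pair x < y by R*_i(x), where i is least with R*_i(x) != R*_i(y)."""
    assert x < y
    R_star = padded(R)
    i = 0
    # The search halts by index 2x + 1, because R*_{2x+1} = {x} separates x from y.
    while R_star(i)(x) == R_star(i)(y):
        i += 1
    return R_star(i)(x)


def example_family(i):
    """Illustrative family only: R_i is the set of multiples of i + 2."""
    return lambda x: 1 if x % (i + 2) == 0 else 0
```

The padding step is what guarantees termination of the search, which is why the sketch applies it unconditionally rather than assuming the input family already separates points.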


We conclude with the following proposition, which brings our conversation full 
circle back to stability. 


Proposition 8.4.17. Fix $k \geq 1$ and a coloring $c : [\omega]^2 \to k$. There exists an instance
of COH such that if $S$ is any solution to this instance then $c \upharpoonright [S]^2$ is stable.


Proof. For each $x$ and $i < k$, let $R_{kx+i} = \{y > x : c(x, y) = i\}$, thereby obtaining
an instance $(R_n : n \in \omega)$ of COH. Suppose $S$ is an infinite cohesive set for this
family. We claim the stronger fact that $\lim_{y \in S} c(x, y)$ exists for every $x \in \omega$. Indeed,
fix $x$. For each $n$, either $S \subseteq^* R_n$ or $S \subseteq^* \overline{R_n}$, hence by definition there must be
an $i < k$ such that $S \subseteq^* R_{kx+i}$. Thus $c(x, y) = i$ for almost all $y \in S$, meaning
$\lim_{y \in S} c(x, y) = i$. □
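The indexing $R_{kx+i}$ in this proof is easy to misread, so here is a minimal sketch of the family as a Python function. The representation (a coloring as a two-argument function, a set as a characteristic function) and the names `cohesive_instance` and `example_coloring` are assumptions made for illustration.

```python
def cohesive_instance(c, k):
    """Return n -> characteristic function of R_n, where R_{kx+i} = {y > x : c(x, y) = i}."""
    def R(n):
        x, i = divmod(n, k)  # decode n = k*x + i with i < k
        return lambda y: 1 if y > x and c(x, y) == i else 0
    return R


def example_coloring(x, y):
    """Illustrative 3-coloring of pairs only."""
    return (x + y) % 3
```

For each fixed $x$, the sets $R_{kx}, \ldots, R_{kx+k-1}$ partition $\{y : y > x\}$, which is the fact the proof uses to find the limit color.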


8.5 The Cholak–Jockusch–Slaman decomposition


We can now put together all the pieces from the previous section and prove that $\mathsf{RT}^2$
can be decomposed, or split, into the two “simpler” principles $\mathsf{SRT}^2$ and COH. We
will then look at several representative applications.

The Cholak–Jockusch–Slaman decomposition is the following seminal result.


Theorem 8.5.1 (Cholak, Jockusch, and Slaman [33]). $\mathsf{RCA}_0 \vdash (\forall k \geq 2)[\mathsf{RT}^2_k \leftrightarrow
\mathsf{SRT}^2_k + \mathsf{COH}]$.


The proof proceeds by formalizing our earlier arguments. But as may be expected,
the main issues we run into are ones of induction. For ease of presentation, we break
the proof into two lemmas.


Lemma 8.5.2. $\mathsf{RCA}_0 \vdash (\forall k \geq 2)[\mathsf{RT}^2_k \to \mathsf{SRT}^2_k + \mathsf{COH}]$.


Proof (Jockusch and Lempp, unpublished; Mileti [211]). Obviously, $\mathsf{RCA}_0$ proves
$(\forall k)[\mathsf{RT}^2_k \to \mathsf{SRT}^2_k]$. We next show that $\mathsf{RT}^2_2 \to \mathsf{COH}$. To this end, we would
like to formalize the proof of Theorem 8.4.16. The only issue there is with the
induction at the end. In $\mathsf{RCA}_0$, we cannot prove by induction for each $i$ that either
$H \subseteq^* R_i$ or $H \subseteq^* \overline{R_i}$, as the latter is a $\Sigma^0_2$ formula. As is often the case, we can fix
this by being more careful. We now argue in $\mathsf{RCA}_0$. Recall that the coloring $c$ that
$H$ is homogeneous for is defined as follows: for all $x < y$, if $i$ is least such that $R_i$
contains one of $x$ and $y$ but not the other, then $c(x, y) = R_i(x)$.

Say $H$ is homogeneous with color $v < 2$. Let $p : \mathbb{N} \to H$ be the principal function
of $H$, which exists by $\Delta^0_1$ comprehension. We aim to prove the following for each $i$:
for each finite set $F \subseteq \mathbb{N}$, if $R_i(p(x)) = 1 - v$ and $R_i(p(x+1)) = v$ for all $x \in F$, then
$|F| < 2^i$. Note that this is now a $\Pi^0_1$ formula, so we can prove it by induction. For
$i = 0$, this is immediate. For if $x \in \mathbb{N}$ satisfied $R_0(p(x)) = 1 - v$ and $R_0(p(x+1)) = v$
then by definition of $c$ we would have $c(p(x), p(x+1)) = R_0(p(x)) = 1 - v$, a
contradiction. So fix $i > 0$ and assume the result is true for all $j < i$. For each $x \in F$,
let $j_x$ be the least $j$ such that $R_j(p(x)) \neq R_j(p(x+1))$. As $R_i(p(x)) \neq R_i(p(x+1))$
we have $j_x \leq i$, but as $p(x) < p(x+1)$ and $R_i(p(x)) \neq v$ we must in fact have
$j_x < i$. For each $j < i$ let $F_j = \{x \in F : j_x = j\}$. Then $F = \bigcup_{j<i} F_j$, and from the
inductive hypothesis it follows that

\[ |F| = \sum_{j<i} |F_j| \leq \sum_{j<i} 2^j = 2^i - 1. \]

This proves the claim.

It remains to verify, using the claim, that $H$ is cohesive for $(R_i : i \in \mathbb{N})$. Suppose
not and fix an $i$ such that both $H \cap R_i$ and $H \cap \overline{R_i}$ are infinite. Then there exist finite
sets $F$ of arbitrary size satisfying $R_i(p(x)) = 1 - v$ and $R_i(p(x+1)) = v$ for all
$x \in F$. This is impossible by the claim. □


Lemma 8.5.3. $\mathsf{RCA}_0 \vdash (\forall k \geq 2)[\mathsf{SRT}^2_k + \mathsf{COH} \to \mathsf{RT}^2_k]$.


Proof. The first step is to formalize Proposition 8.4.17. We argue in $\mathsf{RCA}_0 + \mathsf{SRT}^2_k +
\mathsf{COH}$. Fix $k \geq 1$ and $c : [\mathbb{N}]^2 \to k$. Define the family $(R_n : n \in \mathbb{N})$ as in
Proposition 8.4.17, and apply COH to find an infinite set $S$ cohesive for this family. As
before, we know that for each $n$, either $S \subseteq^* R_n$ or $S \subseteq^* \overline{R_n}$. Fix $x$, and suppose
towards a contradiction that for each $i < k$ we had $S \subseteq^* \overline{R_{kx+i}}$. Then for each $i < k$
there is a $b$ such that $y > b \to y \notin R_{kx+i}$. By $\mathsf{B}\Sigma^0_2$, which is a consequence of $\mathsf{SRT}^2_2$,
we may fix a $b$ so that $y > b \to (\forall i < k)[y \notin R_{kx+i}]$. By definition, this means
$c(x, b+1) \neq i$ for all $i < k$, which cannot be. So, we may fix $i$ such that $S \subseteq^* R_{kx+i}$,
whence as before we conclude $c(x, y) = i$ for almost all $y \in S$. Thus, $c \upharpoonright [S]^2$ is
stable. Applying $\mathsf{SRT}^2_k$, we obtain an infinite homogeneous set $H \subseteq S$ for $c \upharpoonright [S]^2$.
Then $H$ is also homogeneous for $c$, as desired. □


We can also formulate a version of this result for $\leq_{\mathrm{W}}$. Recall the compositional
product, $*$, from Definition 4.5.11.


Theorem 8.5.4 (Cholak, Jockusch, and Slaman [33]). For all $k \geq 1$, $\mathsf{RT}^2_k \leq_{\mathrm{W}}
\mathsf{SRT}^2_k * \mathsf{COH}$.


Proof. For completeness, we spell out the details. Given $c : [\omega]^2 \to k$, let $(R_n :
n \in \omega)$ be the instance of COH defined in Proposition 8.4.17. Let $\Gamma$ be the Turing
functional that, given a pair of sets $\langle A, B \rangle$ as an oracle, outputs the set $\{\langle x, y, i \rangle \in
A : x, y \in B\}$. In particular, if $S$ is any infinite cohesive set for $(R_n : n \in \omega)$ then
$\Gamma(c, S) = c \upharpoonright [S]^2$, which is an instance of $\mathsf{SRT}^2_k$. The pair $\langle (R_n : n \in \omega), \Gamma \rangle$ is thus
an instance of $\mathsf{SRT}^2_k * \mathsf{COH}$, uniformly computable from $c$, and any solution to this
instance is an infinite homogeneous set $H \subseteq S$ for $c \upharpoonright [S]^2$ and hence for $c$. □
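As a finite illustration of the functional used in this proof, the following sketch restricts a coloring, given as a set of triples, to pairs drawn from a second set. The name `gamma` and the use of finite Python sets are assumptions of the sketch; the functional in the proof acts on infinite oracles.

```python
def gamma(A, B):
    """Restrict a coloring, given as a set of triples (x, y, i), to pairs from B."""
    return {(x, y, i) for (x, y, i) in A if x in B and y in B}


# Finite fragment of the graph of an illustrative coloring c(x, y) = (x + y) % 2.
graph = {(x, y, (x + y) % 2) for x in range(6) for y in range(x + 1, 6)}
restricted = gamma(graph, {1, 3, 4})
```

The point of writing the restriction as a single functional is that it does not depend on the particular cohesive set, which is what the compositional product requires.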


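Curiously, even though $\mathsf{SRT}^2_2$ is a subproblem of $\mathsf{RT}^2_2$, and $\mathsf{COH} \leq_{\mathrm{W}} \mathsf{RT}^2_2$ by
Theorem 8.4.16, we cannot improve the above to an equivalence. As shown by Dzhafarov,
Goh, Hirschfeldt, Patey, and Pauly [81], $\mathsf{COH} * \mathsf{SRT}^2_2 \nleq_{\mathrm{W}} \mathsf{RT}^2_2$.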

The main interest in the Cholak-Jockusch-Slaman decomposition, however, is in 
second order arithmetic, and ergo in Theorem 8.5.1. It is here that the majority of 
applications come from. 


8.5.1 A different proof of Seetapun’s theorem 


The Cholak–Jockusch–Slaman decomposition can be applied to obtain results about
$\mathsf{RT}^2$ by establishing them separately for $\mathsf{SRT}^2$ and for COH. The main advantage
here is that such proofs are more modular and thus more readily adaptable to other
situations. A good example is Seetapun’s theorem, which we now re-prove using this
approach. We will see several other examples in the next section, as well as in the
exercises.

The basic outline for proving cone avoidance of $\mathsf{RT}^2$ breaks down into the following
two steps.


1. Prove that COH admits cone avoidance. 
2. Prove that $\mathsf{SRT}^2$ admits cone avoidance.


To see how these fit together, let us look at computable instances for simplicity.
Given a computable instance $c$ of $\mathsf{RT}^2$, we apply cone avoidance of COH to obtain a
cohesive set $S$ that does not compute a given noncomputable set $C$. The restriction
of $c$ to this cohesive set is stable and $S$-computable. Hence, by cone avoidance of
$\mathsf{SRT}^2$ (now relative to $S$) we can find an infinite homogeneous set $H \subseteq S$ for $c \upharpoonright [S]^2$
such that $C \nleq_{\mathrm{T}} S \oplus H$. And of course, $H$ is homogeneous for $c$.

As remarked earlier, it is much easier to work with $\mathsf{D}^2$ in place of $\mathsf{SRT}^2$. But another
simplification is to go even further, and work with $\mathsf{RT}^1$ instead. By Proposition 8.4.4,
we can pass from any $\mathsf{D}^2$-instance to an $\mathsf{RT}^1$-instance with the same solutions.
However, we cannot simply replace $\mathsf{SRT}^2$ by $\mathsf{RT}^1$ in (2). The problem is that the $\mathsf{RT}^1$
instance is only computable from the jump of the $\mathsf{D}^2$-instance. In our example above,
the $\mathsf{RT}^1$-instance, call it $d$, would thus be $S'$-computable, so if $C = \emptyset'$, say, then we
could very well have $C \leq_{\mathrm{T}} d$. In this case, cone avoidance of $\mathsf{RT}^1$ would not by itself
suffice to conclude that $d$ has a solution $H$ with $C \nleq_{\mathrm{T}} S \oplus H$. So instead, we prove the following:


2′. $\mathsf{RT}^1$ admits strong cone avoidance.


Here, we recall the notion of strong cone avoidance from Definition 3.6.13. This is a
stronger property than (ordinary) cone avoidance, but ensuring it is worth it for the
benefit of being able to work with a coloring of singletons instead of pairs. Note that
$\mathsf{RT}^1$ admits strong cone avoidance if and only if $\mathsf{D}^2$ does. So certainly (2′) implies
(2). It is worth noting that we cannot prove strong cone avoidance for $\mathsf{SRT}^2$ directly,
even if we wanted to.


Proposition 8.5.5. $\mathsf{SRT}^2_2$ does not admit strong cone avoidance.


Proof. Fix a $\emptyset'$-computable increasing $f$ such that $\emptyset'$ is computable from any
function that dominates $f$ (Exercise 8.10.2). Define $c : [\omega]^2 \to 2$ by

\[ c(x, y) = \begin{cases} 0 & \text{if } y < f(x), \\ 1 & \text{otherwise,} \end{cases} \]

for all $x < y$. Thus, for each $x$ and all sufficiently large $y > x$ we have $c(x, y) = 1$, so
$c$ is stable. And any infinite homogeneous set for $c$ must have color 1. Let $H$ be any
such set, say $H = \{h_0 < h_1 < \cdots\}$. Then for all $x$ we have $h_{x+1} \geq f(h_x) \geq f(x)$,
so the $H$-computable function $x \mapsto h_{x+1}$ dominates $f$. We conclude $\emptyset' \leq_{\mathrm{T}} H$. □
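The domination argument can be checked on a toy example. In the sketch below, `f` is a computable stand-in chosen purely for illustration (the function in the proof is $\emptyset'$-computable and has a domination property that no computable function has), and `homogeneous_prefix` is a hypothetical helper that greedily builds an initial segment of a homogeneous set of color 1.

```python
def f(x):
    """Stand-in for the function f in the proof; any increasing function works here."""
    return 2 * x + 3


def c(x, y):
    """The coloring from the proof: 0 if y < f(x), and 1 otherwise (for x < y)."""
    return 0 if y < f(x) else 1


def homogeneous_prefix(length):
    """Greedily list the first `length` elements of a homogeneous set of color 1."""
    h = [0]
    while len(h) < length:
        h.append(f(h[-1]))  # h_{x+1} >= f(h_x) keeps every pair at color 1
    return h
```

With this stand-in, one can confirm directly that every pair in the prefix has color 1 and that $x \mapsto h_{x+1}$ is at least $f(x)$ on the computed range, mirroring the inequality $h_{x+1} \geq f(h_x) \geq f(x)$.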


Thus, even though $\mathsf{SRT}^2$ and $\mathsf{D}^2$ are interchangeable in most contexts, they are
different in certain aspects.
We now proceed with the proof. First, we state the two key lemmas we will need. 


Lemma 8.5.6 (Cholak, Jockusch, and Slaman [33]). COH admits cone avoidance.

Lemma 8.5.7 (Dzhafarov and Jockusch [86]). $\mathsf{RT}^1$ admits strong cone avoidance.


Next, let us see in detail how these combine to yield Theorem 8.3.1. 


Proof (of Theorem 8.3.1 from Lemmas 8.5.6 and 8.5.7). By Corollary 4.6.12, we 
know that RT2 =); RTS, so it suffices to prove cone avoidance for the latter. So 
fix sets $A$ and $C$ with $C \nleq_{\mathrm{T}} A$ and an $A$-computable coloring $c : [\omega]^2 \to 2$. Define
an instance $\{R_x : x \in \omega\}$ of COH by $R_x = \{y > x : c(x, y) = 0\}$ for all $x \in \omega$.
By Lemma 8.5.6, fix an infinite cohesive set $S$ for this family of sets satisfying
$C \nleq_{\mathrm{T}} A \oplus S$. By Proposition 8.4.17, $c \upharpoonright [S]^2$ is stable. Applying Lemma 8.5.7, we
may fix an infinite limit homogeneous set $L \subseteq S$ for $c$ satisfying $C \nleq_{\mathrm{T}} A \oplus S \oplus L$.
Now by Proposition 8.4.2, we may fix an infinite $c \oplus L$-computable (and hence
$A \oplus L$-computable) homogeneous set $H \subseteq L$ for $c \upharpoonright [S]^2$ (and therefore $c$). We have
$C \nleq_{\mathrm{T}} A \oplus H$, as desired. □


It remains to prove the lemmas in turn. The first is a simple application of facts
we have already collected earlier in our discussion.


Proof (of Lemma 8.5.6). Fix sets $A$ and $C$ with $C \nleq_{\mathrm{T}} A$, and let $(R_i : i \in \omega)$ be
any $A$-computable instance of COH. Consider Mathias forcing with conditions $(E, I)$
such that $I \subseteq \omega$ and $C \nleq_{\mathrm{T}} A \oplus I$. By Example 7.3.7, relativized to $A$, every sufficiently
generic set $G$ satisfies $C \nleq_{\mathrm{T}} A \oplus G$, and by Example 7.3.9, every sufficiently generic
set $G$ is cohesive for the $R_i$. Thus, any sufficiently Mathias generic set $G$ is a solution
witnessing cone avoidance of COH. □


Proof (of Lemma 8.5.7). Fix sets $A$ and $C$ with $C \nleq_{\mathrm{T}} A$, along with a coloring
$c : \omega \to k$ for some $k$. (Note that there is no hypothesis that $c$ be $A$-computable, or
effective in any other way.) We aim to produce an infinite homogeneous set $H$ for $c$
such that $C \nleq_{\mathrm{T}} A \oplus H$.

First, we define an infinite set $X \subseteq \omega$ with $C \nleq_{\mathrm{T}} A \oplus X$ and a set $K \subseteq \{0, \ldots, k-1\}$,
as follows. Define $X_0 = \omega$ and $K_0 = \{0, \ldots, k-1\}$, and suppose we have defined $X_i$
and $K_i$ for some $i < k$. If there is an infinite set $X^* \subseteq X_i$ such that $C \nleq_{\mathrm{T}} A \oplus X^*$ and
$c(x) \neq i$ for all $x \in X^*$, then let $X_{i+1} = X^*$ and let $K_{i+1} = K_i \setminus \{i\}$. Otherwise, let
$X_{i+1} = X_i$ and $K_{i+1} = K_i$. Finally, set $X = X_k$ and $K = K_k$. By construction, for all
$i < k$, if $i \in K$ then $c(x) = i$ for infinitely many $x \in X$, and if $i \notin K$ then $c(x) \neq i$ for
all $x \in X$. For ease of notation, assume $K = \{0, \ldots, \ell-1\}$ for some $\ell \leq k$.

We force with $\ell$-fold Mathias conditions $(E_0, \ldots, E_{\ell-1}, I)$ within $X$ with the
following additional clauses.


• For each $i < \ell$, $c(x) = i$ for all $x \in E_i$.
• $C \nleq_{\mathrm{T}} A \oplus I$.


By choice of $X$ and $K$, it is easy to see that for each $i < \ell$ and each $n \in \omega$, the set of
conditions $(E_0, \ldots, E_{\ell-1}, I)$ with $|E_i| \geq n$ is dense. Hence, any sufficiently generic
object for this forcing will have the form $G = H_0 \oplus \cdots \oplus H_{\ell-1}$, where each $H_i$ is an
infinite homogeneous set for $c$ with color $i$.

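Next, fix $\ell$ indices $e_0, \ldots, e_{\ell-1} \in \omega$. We claim that the set of conditions forcing
that there is an $i < \ell$ such that

\[ (\exists w)\neg[\Phi_{e_i}^{A \oplus H_i}(w)\downarrow = C(w)] \tag{8.3} \]

is dense. Fix any condition $(E_0, \ldots, E_{\ell-1}, I)$. We exhibit an extension forcing (8.3).
Consider the class $\mathcal{C}$ of all sets $Z = Z_0 \oplus \cdots \oplus Z_{\ell-1}$ as follows.


• $Z_0, \ldots, Z_{\ell-1}$ partition $I$.
• There is no $i < \ell$ and $w > \min I$ such that there exist $F_0, F_1 \subseteq Z_i$ with
  $\Phi_{e_i}^{A \oplus (E_i \cup F_0)}(w)\downarrow \neq \Phi_{e_i}^{A \oplus (E_i \cup F_1)}(w)\downarrow$ (with uses bounded by $2 \max F_0$ and
  $2 \max F_1$, respectively).


Notice that $\mathcal{C}$ is a $\Pi^0_1(A \oplus I)$ class. We investigate two cases.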


Case 1: $\mathcal{C} \neq \emptyset$. By the cone avoidance basis theorem (Theorem 2.8.23) relative to
$A \oplus I$, we may fix $Z = Z_0 \oplus \cdots \oplus Z_{\ell-1} \in \mathcal{C}$ such that $C \nleq_{\mathrm{T}} A \oplus Z$. Since $I$ is
infinite and the $Z_i$ partition $I$, there must be an $i < \ell$ such that $Z_i$ is infinite. Fix such
an $i$. Then the condition $(E^*_0, \ldots, E^*_{\ell-1}, I^*) = (E_0, \ldots, E_{\ell-1}, Z_i)$ is an extension of
$(E_0, \ldots, E_{\ell-1}, I)$ forcing that there is a $w > \min I$ for which $\Phi_{e_i}^{A \oplus H_i}(w)\uparrow$, which
implies (8.3).

Case 2: $\mathcal{C} = \emptyset$. In this case, by compactness, we may fix a level $n \in \omega$ such that
if $Z_0, \ldots, Z_{\ell-1}$ partition $I$ then there is an $i < \ell$, a $w > \min I$, and sets $F_0, F_1 \subseteq
Z_i \upharpoonright n$ such that $\Phi_{e_i}^{A \oplus (E_i \cup F_0)}(w)\downarrow \neq \Phi_{e_i}^{A \oplus (E_i \cup F_1)}(w)\downarrow$. Then there is $b < 2$ such that
$\Phi_{e_i}^{A \oplus (E_i \cup F_b)}(w) \neq C(w)$. In this case, for $j \neq i$, set $E^*_j = E_j$, set $E^*_i = E_i \cup F_b$, and set
$I^* = \{x \in I : x > F_b\}$. Then $(E^*_0, \ldots, E^*_{\ell-1}, I^*)$ is an extension of $(E_0, \ldots, E_{\ell-1}, I)$
forcing that there is a $w > \min I$ for which $\Phi_{e_i}^{A \oplus H_i}(w)\downarrow \neq C(w)$.


By Lachlan’s disjunction, we conclude that if $G = H_0 \oplus \cdots \oplus H_{\ell-1}$ is sufficiently
generic then there is an $i < \ell$ such that $C \nleq_{\mathrm{T}} A \oplus H_i$. Combining this with our earlier
observation that $H_i$ is homogeneous for $c$ completes the proof. □


8.5.2 Other applications 


Another application, broadly following the same outline as the alternative proof of 
Seetapun’s theorem in the previous section, is the following. 


Theorem 8.5.8 (Cholak, Jockusch, and Slaman [33]). Fix $A \in 2^{\omega}$ and $X \gg A'$.
Every $A$-computable instance of $\mathsf{RT}^2$ has a solution $H$ satisfying $(A \oplus H)' \leq_{\mathrm{T}} X$.


This has the following corollary, which complements Seetapun’s theorem.


Corollary 8.5.9. $\mathsf{RT}^2$ admits low$_2$ solutions.


Proof. Fix $A$ and by the low basis theorem (Theorem 2.8.18) relative to $A'$, choose
$X \gg A'$ with $X' \leq_{\mathrm{T}} A''$. Applying Theorem 8.5.8, every $A$-computable instance $c$
of $\mathsf{RT}^2$ has a solution $H$ such that $(A \oplus H)' \leq_{\mathrm{T}} X$. Hence $(A \oplus H)'' \leq_{\mathrm{T}} X' \leq_{\mathrm{T}} A''$. □


Of course, by Theorem 8.2.2, $\mathsf{RT}^2$ omits $\Sigma^0_2$ (hence $\Delta^0_2$, and so certainly low)
solutions. So the above cannot be improved from low$_2$ to low. Since no low$_2$ set can
compute $\emptyset'$, Corollary 8.5.9 establishes certain kinds of cone avoidance for $\mathsf{RT}^2$. In
fact, it is possible to add in full cone avoidance.


Theorem 8.5.10 (Dzhafarov and Jockusch [86]). Fix $A \in 2^{\omega}$ and $C \nleq_{\mathrm{T}} A$. Every
$\mathsf{RT}^2$ instance has a solution $H$ such that $(A \oplus H)'' \leq_{\mathrm{T}} A''$ and $C \nleq_{\mathrm{T}} A \oplus H$.


We omit the proof, which uses techniques similar to those we will see below. 

Let us move on to the proof of Theorem 8.5.8. Using the Cholak–Jockusch–Slaman
decomposition, we split the proof into a version for COH and a version for
$\mathsf{SRT}^2_2$. We begin with the former.


Lemma 8.5.11. Fix $A \in 2^{\omega}$ and $X \gg A'$. Every $A$-computable instance of COH has
a solution $G$ satisfying $(A \oplus G)' \leq_{\mathrm{T}} X$.


Proof. We prove the result for $A = \emptyset$. The general case easily follows by relativization.
Fix $X \gg \emptyset'$ and a computable instance $R = \{R_i : i \in \omega\}$ of COH. Consider
Mathias forcing with computable reservoirs. We prove two claims.

Claim 1: For each $e \in \omega$, the set of conditions deciding $e \in G'$ is uniformly
$X$-effectively dense. Indeed, fix any condition $(E, I)$. Then $\emptyset'$ (and hence $X$) can tell if
there is a finite $F \subseteq I$ such that $\Phi_e^{E \cup F}(e)\downarrow$ (with use bounded by $\max F$, as usual).
If so, $\emptyset'$ can find the least such $F$, and then $(E^*, I^*) = (E \cup F, \{y \in I : y > F\})$ is
an extension of $(E, I)$ forcing $e \in G'$. On the other hand, if there is no such $F$ then
$(E, I)$ already forces $\neg(e \in G')$.


Claim 2: For each $i \in \omega$, the set of conditions forcing $G \subseteq^* R_i \vee G \subseteq^* \overline{R_i}$ is uniformly
$X$-effectively dense. Fix any condition $(E, I)$. Note that “$I \cap R_i$ is infinite” and
“$I \cap \overline{R_i}$ is infinite” are both $\Pi^0_1(\emptyset')$ statements, at least one of which is true. By
Theorem 2.8.25 (4), relativized to $\emptyset'$, $X$ can uniformly computably identify one of
these two statements which is true. If it is the former, we let $(E^*, I^*) = (E, I \cap R_i)$,
which forces $G \subseteq^* R_i$. If it is the latter, we let $(E^*, I^*) = (E, I \cap \overline{R_i})$, which forces
$G \subseteq^* \overline{R_i}$.

To complete the proof, note that for each $n$, the collection of conditions $(E, I)$
with $|E| \geq n$ is uniformly computably dense (even though the forcing is only
$\emptyset''$-computable). Hence, by Theorem 7.5.6 (as in Example 7.5.7) there is an
$X$-computable generic sequence $(E_0, I_0) \geq (E_1, I_1) \geq \cdots$ such that $G = \bigcup_s E_s \in \omega^{\omega}$
and for all $s$, if $s = 3e + 1$ then $(E_s, I_s)$ decides $e \in G'$, and if $s = 3i + 2$ then $(E_s, I_s)$
forces $G \subseteq^* R_i \vee G \subseteq^* \overline{R_i}$. It follows that $G$ is $R$-cohesive. And for each $e$, $e \in G'$
if and only if $\Phi_e^{E_{3e+1}}(e)\downarrow$ (with use bounded by $\max E_{3e+1}$). This can be checked by $X$
since it knows the entire sequence of conditions, so $G' \leq_{\mathrm{T}} X$, as desired. □


Now let us prove a version of Theorem 8.5.8 for $\mathsf{SRT}^2_2$. We emphasize we are
looking at $\mathsf{SRT}^2_2$, not $\mathsf{SRT}^2$.


Lemma 8.5.12. Fix $A \in 2^{\omega}$ and $X \gg A'$. Every $A$-computable instance of $\mathsf{SRT}^2_2$
has a solution $H$ satisfying $(A \oplus H)' \leq_{\mathrm{T}} X$.


Proof. Again, we prove just the unrelativized version. Fix a computable stable coloring
$c : [\omega]^2 \to 2$. If $c$ has a low infinite homogeneous set, we may take this to be
$H$. Then $H' \leq_{\mathrm{T}} \emptyset' \leq_{\mathrm{T}} X$, so we are done. So suppose $c$ has no low infinite homogeneous
set. For ease of notation, for each $i < 2$ define $P_i = \{x : \lim_y c(x, y) = i\}$.
By Remark 8.4.3, $P_0$ and $P_1$ are both $\emptyset'$-computable. Our assumption implies that
neither has a low infinite subset, as this would be a low limit homogeneous set for $c$,
which could then be thinned to a low homogeneous set using Proposition 8.4.2.

We produce an infinite subset $G$ of either $P_0$ or $P_1$ with $G' \leq_{\mathrm{T}} X$. We force with
2-fold Mathias conditions $(E_0, E_1, I)$ such that $E_i \subseteq P_i$ for each $i < 2$ and such that
$I$ is low. A sufficiently generic filter thus yields limit homogeneous sets $G_0$ and $G_1$
for $c$ with colors 0 and 1, respectively. We use $G_0$ and $G_1$ as names for these objects
in our forcing language, though these are just abbreviations for definitions using the
parameter $G$, as in Remark 8.3.6.

The two key density facts are the following. 


Claim 1: For each $n \in \omega$ and each $i < 2$, the set of conditions forcing $(\exists x > n)[x \in
G_i]$ is uniformly $X$-effectively dense. Fix $n$ and $i$ and a condition $(E_0, E_1, I)$. Since
$P_{1-i}$ has no infinite low subset, it must be that $I \cap P_i$ is infinite. In particular, we
can $\emptyset'$-computably (and hence certainly $X$-computably) find an $x > n$ in $I \cap P_i$. Let
$E^*_i = E_i \cup \{x\}$, $E^*_{1-i} = E_{1-i}$, and $I^* = \{y \in I : y > x\}$. Then $(E^*_0, E^*_1, I^*)$ is an
extension of $(E_0, E_1, I)$ forcing $x \in G_i$.


Claim 2: For all $e_0, e_1 \in \omega$, the set of conditions that decide, for some $i < 2$, whether
$e_i \in G_i'$ is uniformly $X$-effectively dense. Consider the class $\mathcal{C}$ of all sets $Z = Z_0 \oplus Z_1$
as follows.


• $Z_0 \cup Z_1 = I$.
• For each $i < 2$ and every finite $F \subseteq Z_i$, $\Phi_{e_i}^{E_i \cup F}(e_i)\uparrow$.


Then $\mathcal{C}$ is a $\Pi^0_1(I)$ class. Clearly, $I$ can uniformly compute a tree $T \subseteq 2^{<\omega}$ such that
$\mathcal{C} = [T]$. Since $I$ is low, this means that $\emptyset'$ can uniformly compute whether or not
$\mathcal{C} \neq \emptyset$, since this is the same as whether or not $T$ is infinite. Suppose first that $\mathcal{C} = \emptyset$.
Then in particular, $(I \cap P_0) \oplus (I \cap P_1) \notin \mathcal{C}$. Since $(I \cap P_0) \oplus (I \cap P_1) \leq_{\mathrm{T}} \emptyset'$, it follows
that $\emptyset'$ (and hence $X$) can uniformly search for an $i < 2$ and a finite set $F \subseteq I \cap P_i$
such that $\Phi_{e_i}^{E_i \cup F}(e_i)\downarrow$, and this search must succeed. For the least $i$ such that some
such $F$ is found, let $E^*_i = E_i \cup F$, let $E^*_{1-i} = E_{1-i}$, and let $I^* = \{y \in I : y > F\}$.
Then $(E^*_0, E^*_1, I^*)$ is an extension of $(E_0, E_1, I)$ forcing $\Phi_{e_i}^{G_i}(e_i)\downarrow$. Suppose next that
$\mathcal{C} \neq \emptyset$. Since $I$ is low, $\emptyset'$ (and hence $X$) can uniformly compute an element $Z \in \mathcal{C}$
such that $(I \oplus Z)' \leq_{\mathrm{T}} I' \equiv_{\mathrm{T}} \emptyset'$. Thus, each of the statements “$Z_0$ is infinite” and “$Z_1$
is infinite” is uniformly $\Pi^0_1(\emptyset')$, and one of them is true. So by Theorem 2.8.25 (4),
relativized to $\emptyset'$, $X$ can uniformly computably find an $i < 2$ such that $Z_i$ is infinite.
In this case, we let $(E^*_0, E^*_1, I^*) = (E_0, E_1, Z_i)$, which is an extension of $(E_0, E_1, I)$
forcing $\Phi_{e_i}^{G_i}(e_i)\uparrow$.

Now apply Theorem 7.5.6 to construct an $X$-computable sequence of conditions
$(E^0_0, E^0_1, I^0) \geq (E^1_0, E^1_1, I^1) \geq \cdots$ such that the following hold for all $s$: if $s = 3n + i$,
$i < 2$, then $(E^s_0, E^s_1, I^s) \Vdash (\exists x > n)[x \in G_i]$; if $s = 3\langle e_0, e_1 \rangle + 2$, then $(E^s_0, E^s_1, I^s)$
decides, for some $i < 2$, whether $e_i \in G_i'$.

It is clear that for each $i < 2$, $G_i = \bigcup_s E^s_i$ is an infinite subset of $P_i$.

Now by Lachlan’s disjunction (Lemma 8.3.7), there is an $i < 2$ such that for all
$e \in \omega$ there is an $s$ such that $(E^s_0, E^s_1, I^s)$ decides whether $e \in G_i'$. Fix this $i$. Since $X$
knows the sequence of conditions, given $e$, $X$ can search for (and find) an $s$ such that
either $\Phi_e^{E^s_i}(e)\downarrow$ or, for all finite $F \subseteq I^s$, $\Phi_e^{E^s_i \cup F}(e)\uparrow$. In this way, $X$ can conclude
whether $e \in G_i'$ or not. Thus, $G_i' \leq_{\mathrm{T}} X$.

Finally, using Proposition 8.4.2, we thin $G_i$ to an infinite set $H \subseteq G_i$ which is
computable from $G_i$ and homogeneous for $c$. □


We can now prove the main theorem. 


Proof (of Theorem 8.5.8). We wish to show that for every $k \geq 2$, every $A \in 2^{\omega}$,
and every $X \gg A'$, every $A$-computable instance $c$ of $\mathsf{RT}^2_k$ has a solution $H$ such
that $(A \oplus H)' \leq_{\mathrm{T}} X$. We proceed by induction on $k$. We prove the base case,
$k = 2$, for $A = \emptyset$. The full case is obtained by relativization. Using density of $\gg$
(Proposition 2.8.26), fix $X_0$ such that $X \gg X_0 \gg \emptyset'$. Fix a computable instance
$c$ of $\mathsf{RT}^2_2$. Define a computable instance $R = (R_x : x \in \omega)$ of COH by setting
$R_x = \{y > x : c(x, y) = 0\}$ for all $x$. By Lemma 8.5.11, there is an infinite $R$-cohesive
set $G$ with $G' \leq_{\mathrm{T}} X_0$. As in Proposition 8.4.17, $c \upharpoonright [G]^2$ is stable. Since $c \upharpoonright [G]^2$ is
$G$-computable and $X \gg X_0 \geq_{\mathrm{T}} G'$, it follows by Lemma 8.5.12, relativized to $G$,
that there is an infinite homogeneous set $H \subseteq G$ for $c \upharpoonright [G]^2$ with $(G \oplus H)' \leq_{\mathrm{T}} X$.
Since $H$ is also homogeneous for $c$, we are done.

Now fix $k > 2$ and assume the result holds for $k - 1$. Again, we take $A =
\emptyset$ for simplicity. Fix a computable instance $c$ of $\mathsf{RT}^2_k$. Using the density of $\gg$
(Proposition 2.8.26), fix $X_0$ such that $X \gg X_0 \gg \emptyset'$. Define a coloring $c_0 : [\omega]^2 \to
k - 1$ as follows: for all $\vec{x} \in [\omega]^2$,

\[ c_0(\vec{x}) = \begin{cases} c(\vec{x}) & \text{if } c(\vec{x}) < k - 2, \\ k - 2 & \text{otherwise.} \end{cases} \]

Since $X_0 \gg \emptyset'$ and $c_0$ is a computable instance of $\mathsf{RT}^2_{k-1}$ we may apply the inductive
hypothesis to obtain an infinite homogeneous set $H_0$ for $c_0$ such that $H_0' \leq_{\mathrm{T}} X_0$. If
$H_0$ has color $i < k - 2$ for $c_0$, then $H_0$ is also homogeneous for $c$ and so we can
take $H = H_0$ and have $H' \leq_{\mathrm{T}} X_0 \leq_{\mathrm{T}} X$. Otherwise, $c(\vec{x})$ is either $k - 2$ or $k - 1$
for all $\vec{x} \in [H_0]^2$. In this case, define a coloring $c_1 : [H_0]^2 \to 2$ as follows: for all
$\vec{x} \in [H_0]^2$, $c_1(\vec{x}) = c(\vec{x}) - k + 2$. Since $X \gg H_0'$ and $c_1$ is an $H_0$-computable
instance of $\mathsf{RT}^2_2$, we may apply the inductive hypothesis, relativized to $H_0$, to obtain
an infinite homogeneous set $H \subseteq H_0$ for $c_1$ such that $(H_0 \oplus H)' \leq_{\mathrm{T}} X$. Now $H$ is
also homogeneous for $c$ and we have $H' \leq_{\mathrm{T}} X$, as desired. □
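The inductive step reduces the number of colors by merging the two largest ones and then separating them again on a homogeneous set. A minimal sketch of the two auxiliary colorings follows; the function names and the representation of a coloring as a two-argument Python function are assumptions made for illustration.

```python
def merge_top_colors(c, k):
    """c_0: agree with c below k - 2, and send both k - 2 and k - 1 to k - 2."""
    return lambda x, y: c(x, y) if c(x, y) < k - 2 else k - 2


def split_top_colors(c, k):
    """c_1: on pairs where c is k - 2 or k - 1, return c - k + 2, a value in {0, 1}."""
    return lambda x, y: c(x, y) - k + 2


def example_coloring(x, y):
    """Illustrative 4-coloring of pairs only."""
    return (x + y) % 4
```

Note that `split_top_colors` is only meaningful on pairs whose original color is $k - 2$ or $k - 1$, which is exactly the situation in which the proof applies it.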


8.6 Liu’s theorem 


Having seen that Ramsey’s theorem for pairs is strictly weaker than arithmetical
comprehension, the next logical point of comparison is with weak König’s lemma.
For starters, we can already show that $\mathsf{WKL}_0$ does not prove $\mathsf{RT}^2_2$. Indeed, there is an
$\omega$-model satisfying WKL contained entirely in the low sets, but by Theorem 8.1.1,
$\mathsf{RT}^2_2$ has a computable instance with no $\Delta^0_2$ (let alone low) solution. Thus, $\mathsf{RT}^2_2 \nleq_{\omega}
\mathsf{WKL}$. What about in the other direction, does $\mathsf{RCA}_0 \vdash \mathsf{RT}^2 \to \mathsf{WKL}$? Following
Seetapun’s proof this question quickly gained prominence as a major problem in
computable combinatorics, particularly as more and more attempts at solving it
proved unsuccessful. A correct proof was finally found in 2011, by Liu.


Theorem 8.6.1 (Liu). $\mathsf{RT}^2$ admits PA avoidance.


Corollary 8.6.2. $\mathsf{WKL} \nleq_{\omega} \mathsf{RT}^2_2$. Hence also $\mathsf{RCA}_0 \nvdash \mathsf{RT}^2 \to \mathsf{WKL}$ (and so the two
principles are incomparable over $\mathsf{RCA}_0$).


The story here bears an almost uncanny resemblance to the resolution of another
longstanding open problem in computability theory (Post’s problem, by Friedberg
and Muchnik) in that the breakthrough did not come from a senior researcher, long
established in the field, but rather a newcomer. Indeed, like Friedberg and Muchnik, 
Liu was still an undergraduate student at the time his proof was published. And
what a proof! As we will see, it has many novel elements that really set it apart
from the arguments we have seen up to this point. At the same time, the proof still
employs (a variant of) Mathias forcing and follows the Cholak–Jockusch–Slaman
decomposition, so we will not be wholly on unfamiliar ground.

The proof breaks into two halves, following the general outline of the proof of 
Seetapun’s theorem that we saw in Section 8.5.1. We will utilize the notion of strong 
PA avoidance from Definition 3.6.13. 


Proposition 8.6.3. COH admits PA avoidance. 
Proposition 8.6.4. $\mathsf{RT}^1_2$ admits strong PA avoidance.


We have already proved the former result in Proposition 3.8.5, where we dealt 
with COH in the guise of SeqCompact,... (Recall that COH =, SeqCompact,., 
by Proposition 4.1.7.) Thus, the remainder of this section is dedicated to proving 
Proposition 8.6.4. 


8.6.1 Preliminaries 


To prove Proposition 8.6.4, we must show that for all sets $A$, $C$ with $A \not\gg C$, every
instance $c : \omega \to 2$ of $\mathsf{RT}^1_2$ (computable from $A$ or not) has a solution $G$ such that
$A \oplus G \not\gg C$. For ease of notation, we work with sets and subsets rather than colorings,
per Proposition 8.4.4. That is, we show that every set $P$ has an infinite subset $G$ in it
or its complement such that $A \oplus G \not\gg C$. (Formally, $P = \{x : c(x) = 0\}$, but we will
not mention $c$ again.) We may assume that $P$ is not computable, as otherwise either
$P$ or $\overline{P}$ is infinite and can serve as $G$. As a result, we can take $A = C = \emptyset$. The only
property of $\emptyset$ we will invoke is that $\emptyset \not\gg \emptyset$, and hence that it does not compute $P$.
Thus, the full result easily follows by relativization.


Definition 8.6.5. 


1. A pre-condition is a sequence $p = (E_0, \ldots, E_{n-1}, \mathcal{C})$ where the $E_i$ are finite
   sets, called the finite parts of the pre-condition, and $\mathcal{C}$ is a $\Pi^0_1$ class such that
   $E_i < X_i$ for every $i < n$ and every $X_0 \oplus \cdots \oplus X_{n-1} \in \mathcal{C}$. We call $n$ the length of
   the pre-condition.

2. A finite part $E_i$ of the pre-condition $(E_0, \ldots, E_{n-1}, \mathcal{C})$ is acceptable if there
   exists $X_0 \oplus \cdots \oplus X_{n-1} \in \mathcal{C}$ such that $X_i \cap P$ and $X_i \cap \overline{P}$ are both infinite.

3. We call $(E_0, \ldots, E_{n-1}, \mathcal{C})$ a condition if for some $N \in \omega$, called the bound of
   the condition, whenever $X_0 \oplus \cdots \oplus X_{n-1}$ belongs to $\mathcal{C}$ then every $x > N$ belongs
   to $\bigcup_{i<n} X_i$.

4. A pre-condition $(\widehat{E}_0, \ldots, \widehat{E}_{m-1}, \widehat{\mathcal{C}})$ extends $(E_0, \ldots, E_{n-1}, \mathcal{C})$ if $m \geq n$ and there
   exists a surjective map $c : m \to n$ such that for all $i < m$ and all $Y_0 \oplus \cdots \oplus Y_{m-1} \in \widehat{\mathcal{C}}$
   there exists $X_0 \oplus \cdots \oplus X_{n-1} \in \mathcal{C}$ such that $E_{c(i)} \subseteq \widehat{E}_i \subseteq E_{c(i)} \cup X_{c(i)}$. In this
   case, we say that $c$ witnesses this extension, and that $\widehat{E}_i$ is a child of $E_{c(i)}$.

5. We call $(\widehat{E}_0, \ldots, \widehat{E}_{m-1}, \widehat{\mathcal{C}})$ a finite extension of $(E_0, \ldots, E_{n-1}, \mathcal{C})$ if $m = n$,
   the witness $c$ is the identity, and for every $Y_0 \oplus \cdots \oplus Y_{n-1} \in \widehat{\mathcal{C}}$ there exists
   $X_0 \oplus \cdots \oplus X_{n-1} \in \mathcal{C}$ such that $Y_i =^* X_i$ for all $i < n$.

6. A set $G$ satisfies $(E_0, \ldots, E_{n-1}, \mathcal{C})$ if there exists some $i < n$ and some
   $X_0 \oplus \cdots \oplus X_{n-1} \in \mathcal{C}$ such that $G$ satisfies $(E_i, X_i)$. In this case, we also say that
   $G$ satisfies $(E_0, \ldots, E_{n-1}, \mathcal{C})$ via $i$.


We wish to construct a set $G$ meeting the requirements

$\mathcal{P}_e$: $G \cap P$ and $G \cap \overline{P}$ each contain an element $> e$

for all $e \in \omega$, as well as

$\mathcal{R}_{e_0,e_1}$: $\Phi_{e_0}^{G \cap P}$ or $\Phi_{e_1}^{G \cap \overline{P}}$ is not a 2-valued DNC function

for all $e_0, e_1 \in \omega$. Such a $G$ will thus have infinite intersection with both $P$ and $\overline{P}$,
and by Lachlan’s disjunction (Lemma 8.3.7), either $\Phi_{e_0}^{G \cap P}$ will not be a 2-valued
DNC function for all $e_0$, or $\Phi_{e_1}^{G \cap \overline{P}}$ will not be a 2-valued DNC function for all $e_1$.
The proof of the proposition will follow from the following two lemmas.


Lemma 8.6.6. Every condition $p$ has an acceptable part. Thus, given $e \in \omega$, $p$ has
a finite extension $q$ such that if $E$ is a child of an acceptable part of $p$ then $E \cap P$
and $E \cap \overline{P}$ both contain an element $> e$.


Lemma 8.6.7. Let $p$ be a condition, and let $e_0, e_1 \in \omega$ be given. There exists an
extension $q$ of $p$ such that for all $G$ satisfying $q$, either $\Phi_{e_0}^{G \cap P}$ is not a 2-valued DNC
function or $\Phi_{e_1}^{G \cap \overline{P}}$ is not a 2-valued DNC function.


Proof (of Proposition 8.6.4). First, we inductively define a sequence of conditions

$p_0 \geq p_1 \geq \cdots$

with $p_{i+1}$ extending $p_i$ for all $i \in \omega$. We begin with the condition $p_0 = (\emptyset, \{\omega\})$,
and assume we are given $p_i$ for some $i \in \omega$. If $i = 2e$, let $p_{i+1}$ be the finite extension
obtained by applying Lemma 8.6.6 to $p_i$. If $i = 2\langle e_0, e_1 \rangle + 1$, let $p_{i+1}$ be the extension
obtained by applying Lemma 8.6.7 to $p_i$.

Now let $T$ be the set of all the acceptable parts of the conditions $p_i$, ordered by
setting $E \leq \widehat{E}$ for $E, \widehat{E} \in T$ just in case there exist $i_0 < \cdots < i_{k-1}$ and $E_0, \ldots, E_{k-1}$
such that $E_0 = E$, $E_{k-1} = \widehat{E}$, each $E_j$ is an acceptable part of $p_{i_j}$, and each $E_{j+1}$ is a
child of $E_j$. Under this ordering, $T$ forms a finitely branching tree, because if $\widehat{E}$ is an
acceptable part of some condition and a child of a finite part $E$ of another condition,
then $E$ is also an acceptable part. Furthermore, $T$ is infinite, since every $p_i$ has an
acceptable part by Lemma 8.6.6.

So let $\{G_0, G_1, \ldots\}$ with $G_0 \leq G_1 \leq \cdots$ be any infinite path through this tree,
and set $G = \bigcup_{i \in \omega} G_i$. It is then readily seen that $G$ satisfies $p_{2\langle e_0, e_1 \rangle + 1}$ for all
$e_0, e_1 \in \omega$, and hence that it meets $\mathcal{R}_{e_0,e_1}$. Similarly, $G$ satisfies $p_{2e}$ for all $e$, and
hence meets $\mathcal{P}_e$. □


8.6.2 Proof of Lemma 8.6.6


We require one preliminary lemma. 


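Lemma 8.6.8. Let $p = (E_0, \ldots, E_{n-1}, \mathcal{C})$ be a condition, and let $e \in \omega$ be given.
There exists $X_0 \oplus \cdots \oplus X_{n-1} \in \mathcal{C}$, $i < n$, and $x, y > e$ with $x \in X_i \cap P$ and $y \in X_i \cap \overline{P}$.


Proof. Suppose not. Since $\mathcal{C}$ is a $\Pi^0_1$ class and $P$ is noncomputable, we may choose
some $X_0 \oplus \cdots \oplus X_{n-1} \in \mathcal{C}$ that does not compute $P$. By assumption, for each
$i < n$ either $X_i \subseteq^* P$ or $X_i \subseteq^* \overline{P}$. Let $S = \{i < n : X_i \subseteq^* P\}$, noting that
$\{i < n : i \notin S\} = \{i < n : X_i \subseteq^* \overline{P}\}$. Since $p$ is a condition, we have $\bigcup_{i<n} X_i =^* \omega$.
So, given $x$ larger than $e$ and the bound of $p$, we have that $x \in P$ if and only if $x \in X_i$
for some $i \in S$, meaning $P \leq_{\mathrm{T}} X_0 \oplus \cdots \oplus X_{n-1}$, a contradiction. □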


Now the proof of Lemma 8.6.6 follows. 


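Proof (of Lemma 8.6.6). Fix a condition $p = (E_0, \ldots, E_{n-1}, \mathcal{C})$ and an $e \in \omega$. 
For each $s \in \omega$, we define finite sets $F_0^s, \ldots, F_{n-1}^s$ such that the class $\mathcal{C}_s$ of all 
$X_0 \oplus \cdots \oplus X_{n-1} \in \mathcal{C}$ with 


$$F_i^s \neq \emptyset \to X_i \upharpoonright (\max F_i^s + 1) = F_i^s$$


for all $i < n$ is nonempty. Let $F_0^0 = \cdots = F_{n-1}^0 = \emptyset$, so that $\mathcal{C}_0 = \mathcal{C} \neq \emptyset$, and assume 
inductively that $F_0^s, \ldots, F_{n-1}^s$ have been defined for some $s$. Then $(E_0, \ldots, E_{n-1}, \mathcal{C}_s)$ 
is a condition, so applying the previous lemma with 


$$e = \max\, (E_0 \cup F_0^s) \cup \cdots \cup (E_{n-1} \cup F_{n-1}^s),$$


we may fix $X_0 \oplus \cdots \oplus X_{n-1} \in \mathcal{C}_s$, $i < n$, and $x \in X_i \cap P$ and $y \in X_i \cap \overline{P}$. Let 


$$F_j^{s+1} = \begin{cases} F_j^s & \text{if } j \neq i, \\ F_i^s \cup \{z \in X_i : z \leq \max\{x, y\}\} & \text{if } j = i, \end{cases}$$


noting that $X_i \upharpoonright (\max F_i^{s+1} + 1) = F_i^{s+1}$ by definition, so $X_0 \oplus \cdots \oplus X_{n-1} \in \mathcal{C}_{s+1} \neq \emptyset$. 
If we now let $F_i = \bigcup_s F_i^s$ for each $i$, then $F_0 \oplus \cdots \oplus F_{n-1}$ must belong to $\mathcal{C}$ by 
compactness. Furthermore, for all $s$ and $i$, if $F_i^{s+1} \neq F_i^s$ then $F_i^{s+1} - F_i^s$ intersects 
both $P$ and $\overline{P}$. Since at each stage $F_i^{s+1} \neq F_i^s$ for some $i$, there must be some $i$ for 
which this is the case at infinitely many $s$. Then $F_i$ has infinite intersection with both 
$P$ and $\overline{P}$, meaning $E_i$ is acceptable. □ 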


8.6.3 Proof of Lemma 8.6.7 


We first require a crucial definition. 


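Definition 8.6.9. Let $p = (E_0, \ldots, E_{n-1}, \mathcal{C})$ be a condition, and let $e_0, e_1 \in \omega$ be 
given. 

1. A bit assignment is a finite partial function $v : \omega \to 2$. A bit assignment is 
correct if $\operatorname{dom}(v) \neq \emptyset$ and $\Phi_x(x) \downarrow = v(x)$ for all $x \in \operatorname{dom}(v)$. 
2. Given a bit assignment $v$, let $\mathcal{S}_v$ be the $\Pi^0_1$ class of all sets of the form 


$$X_{0,0} \oplus X_{0,1} \oplus \cdots \oplus X_{n-1,0} \oplus X_{n-1,1}$$


such that $(X_{0,0} \cup X_{0,1}) \oplus \cdots \oplus (X_{n-1,0} \cup X_{n-1,1}) \in \mathcal{C}$ and for all $i < n$ and all 
finite sets $F$, 
* if $F$ satisfies $(E_i \cap P, X_{i,0})$ and $\Phi_{e_0}^F(x) \downarrow \in \{0, 1\}$ for some $x \in \operatorname{dom}(v)$, then 
$\Phi_{e_0}^F(x) \neq v(x)$, 
* if $F$ satisfies $(E_i \cap \overline{P}, X_{i,1})$ and $\Phi_{e_1}^F(x) \downarrow \in \{0, 1\}$ for some $x \in \operatorname{dom}(v)$, then 
$\Phi_{e_1}^F(x) \neq v(x)$. 
3. We say $p$ forces agreement on $e_0, e_1$ if there exists a correct bit assignment $v$ 
such that $\mathcal{S}_v = \emptyset$. 
4. Given bit assignments $v_0, \ldots, v_{s-1}$, let $\operatorname{Cross}(\mathcal{S}_{v_0}, \ldots, \mathcal{S}_{v_{s-1}})$ denote the class of 
all sets of the form 


$$\bigoplus_{i<n} \bigoplus_{j<2} \bigoplus_{k<l<s} X^k_{i,j} \cap X^l_{i,j},$$


where, for each $k < s$, 


$$X^k_{0,0} \oplus X^k_{0,1} \oplus \cdots \oplus X^k_{n-1,0} \oplus X^k_{n-1,1} \in \mathcal{S}_{v_k}.$$

5. For a set $I$ of indices $i < n$, we let $\mathcal{S}_{v,I}$ be defined as $\mathcal{S}_v$ above but with the 
definition applying only to those $i \in I$. We say $p$ forces agreement on $e_0, e_1$ 
inside $I$ if there is a correct bit assignment $v$ such that $\mathcal{S}_{v,I} = \emptyset$, and define 
$\operatorname{Cross}(\mathcal{S}_{v_0,I}, \ldots, \mathcal{S}_{v_{s-1},I})$ as above but with $\mathcal{S}_{v_k}$ replaced by $\mathcal{S}_{v_k,I}$. 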


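Lemma 8.6.10. Let $p = (E_0, \ldots, E_{n-1}, \mathcal{C})$ be a condition, and let $v_0, \ldots, v_{2n}$ be bit 
assignments. Then 


$$(E_0, \ldots, E_0, \ldots, E_{n-1}, \ldots, E_{n-1}, \operatorname{Cross}(\mathcal{S}_{v_0}, \ldots, \mathcal{S}_{v_{2n}})),$$


where each $E_i$ appears $2\binom{2n+1}{2}$ times, is a condition extending $p$ with the same bound. 
For any set $I$ of indices $i < n$, the same result holds if $\operatorname{Cross}(\mathcal{S}_{v_0}, \ldots, \mathcal{S}_{v_{2n}})$ is replaced 
by $\operatorname{Cross}(\mathcal{S}_{v_0,I}, \ldots, \mathcal{S}_{v_{2n},I})$. 

Proof. It is not difficult to see that 


$$(E_0, \ldots, E_0, \ldots, E_{n-1}, \ldots, E_{n-1}, \operatorname{Cross}(\mathcal{S}_{v_0}, \ldots, \mathcal{S}_{v_{2n}}))$$


is a pre-condition extending $p$, as witnessed by $c : 2n\binom{2n+1}{2} \to n$ given by 
$c(2i\binom{2n+1}{2} + j) = i$ for each $i < n$ and $j < 2\binom{2n+1}{2}$. To see that it is in fact a condition, 
let $N$ be the bound of $p$, and fix any $x > N$ along with 


$$X^k_{0,0} \oplus X^k_{0,1} \oplus \cdots \oplus X^k_{n-1,0} \oplus X^k_{n-1,1} \in \mathcal{S}_{v_k}$$


for each $k < 2n + 1$. Since $(X^k_{0,0} \cup X^k_{0,1}) \oplus \cdots \oplus (X^k_{n-1,0} \cup X^k_{n-1,1}) \in \mathcal{C}$ for each 
such $k$, there exist some $i_{k,x} < n$ and $j_{k,x} < 2$ such that $x \in X^k_{i_{k,x}, j_{k,x}}$. But 
since there are only $2n$ many pairs $(j_{k,x}, i_{k,x})$, there must exist $k \neq l$ such that 
$(j_{k,x}, i_{k,x}) = (j_{l,x}, i_{l,x})$. Hence, $x \in X^k_{i_{k,x}, j_{k,x}} \cap X^l_{i_{k,x}, j_{k,x}}$, and thus $x$ belongs to one of 
the components of 


$$\bigoplus_{i<n} \bigoplus_{j<2} \bigoplus_{k<l<2n+1} X^k_{i,j} \cap X^l_{i,j} \in \operatorname{Cross}(\mathcal{S}_{v_0}, \ldots, \mathcal{S}_{v_{2n}}).$$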
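
The counting in this proof is a pigeonhole argument: $2n + 1$ indices $k$ are each assigned one of only $2n$ pairs, so two indices share a pair. The sketch below illustrates this and the count $2\binom{2n+1}{2}$ of copies of each $E_i$. The function names and the sample list `pairs` are hypothetical and serve only as an illustration.

```python
from itertools import combinations

def find_collision(pairs):
    """Return indices (k, l) with k < l and pairs[k] == pairs[l], or None if all differ."""
    seen = {}
    for l, pair in enumerate(pairs):
        if pair in seen:
            return seen[pair], l
        seen[pair] = l
    return None

def copies_per_part(n):
    """Number of times each E_i is repeated: 2 * C(2n+1, 2)."""
    return 2 * len(list(combinations(range(2 * n + 1), 2)))

n = 3
# Hypothetical choice of a pair (i, j) with i < n and j < 2 for each k < 2n + 1 = 7.
pairs = [(0, 0), (1, 1), (2, 0), (0, 1), (1, 0), (2, 1), (1, 1)]
collision = find_collision(pairs)
```

With seven indices and only six possible pairs, a collision always exists; here it occurs at positions 1 and 6.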


Lemma 8.6.11. Let $p = (E_0, \ldots, E_{n-1}, \mathcal{C})$ be a condition, and let $e_0, e_1 \in \omega$ be 
given. If $p$ does not force agreement on $e_0, e_1$ then there exist bit assignments 
$v_0, \ldots, v_{2n}$ such that $\mathcal{S}_{v_i} \neq \emptyset$ for each $i < 2n + 1$, and for each $i < j < 2n + 1$ there 
is some $x \in \operatorname{dom}(v_i) \cap \operatorname{dom}(v_j)$ with $v_i(x) \neq v_j(x)$. For any set $I$ of indices $i < n$, 
the same result holds if forcing agreement is replaced by forcing agreement inside $I$ 
and $\mathcal{S}_{v_i}$ is replaced by $\mathcal{S}_{v_i,I}$. 


Proof. We construct, for each $\sigma \in 2^{<\omega}$, an element $x_\sigma \in \omega$ with certain properties 
described below. Given a bit assignment $v$ not defined on $x_{\sigma \upharpoonright i}$ for any $i < |\sigma|$, let 


$$v_\sigma = v \cup \bigcup_{i < |\sigma|} \{(x_{\sigma \upharpoonright i}, \sigma(i))\}.$$


Each $x_\sigma$ will satisfy that $\Phi_{x_\sigma}(x_\sigma)$ does not converge to 0 or 1, and if $v$ is any 
correct bit assignment not defined on $x_{\sigma \upharpoonright i}$ for any $i$ then each of $\mathcal{S}_{v_\sigma \cup \{(x_\sigma, 0)\}}$ and 
$\mathcal{S}_{v_\sigma \cup \{(x_\sigma, 1)\}}$ is nonempty. Let $y$ be any number such that $\Phi_y(y) \downarrow \in \{0, 1\}$. 

To begin, we claim that there must exist an $x \in \omega$ such that $\Phi_x(x)$ does not 
converge to 0 or 1, and for each correct bit assignment $v$ not defined on $x$, each of 
$\mathcal{S}_{v \cup \{(x, 0)\}}$ and $\mathcal{S}_{v \cup \{(x, 1)\}}$ is nonempty. If not, we could compute a 2-valued DNC 
function $f$ as follows: given $x$, we wait until either $\Phi_x(x) \downarrow \in \{0, 1\}$, in which case 
we let $f(x) = 1 - \Phi_x(x)$, or until we find a correct bit assignment $v$ not defined on 
$x$ and a $j \in \{0, 1\}$ such that the $\Pi^0_1$ class $\mathcal{S}_{v \cup \{(x, j)\}}$ is empty, in which case we let 
$f(x) = j$. (Note that if $\Phi_x(x)$ converged and was equal to $j$ then $v \cup \{(x, j)\}$ would 
be a correct bit assignment, so $\mathcal{S}_{v \cup \{(x, j)\}}$ could not be empty by assumption.) In fact, 
the same argument shows that there must be infinitely many $x$ with this property, so 
let $x_{\langle\rangle}$ be any such $x > y$. 

Now suppose $x_\sigma$ has been defined for some $\sigma \in 2^{<\omega}$, and fix $k \in \{0, 1\}$. We 
claim that there must exist an $x \in \omega$ such that $\Phi_x(x)$ does not converge to 0 or 1, and 
for each correct bit assignment $v$ not defined on $x$ or on $x_{\sigma \upharpoonright i}$ for any $i \leq |\sigma|$, each 
of $\mathcal{S}_{v_\sigma \cup \{(x_\sigma, k), (x, 0)\}}$ and $\mathcal{S}_{v_\sigma \cup \{(x_\sigma, k), (x, 1)\}}$ is nonempty. If not, we could compute a 
2-valued DNC function $f$ as follows: given $x$, we wait until either $\Phi_x(x) \downarrow \in \{0, 1\}$, 
in which case we let $f(x) = 1 - \Phi_x(x)$, or until we find a correct bit assignment 
$v$ not defined on $x$ or on $x_{\sigma \upharpoonright i}$ for any $i$ and a $j \in \{0, 1\}$ such that the $\Pi^0_1$ class 
$\mathcal{S}_{v_\sigma \cup \{(x_\sigma, k), (x, j)\}}$ is empty, in which case we let $f(x) = j$. (Note that if $\Phi_x(x)$ 
converged and was equal to $j$ then $w = v \cup \{(x, j)\}$ would be a correct bit assignment 
not defined on $x_{\sigma \upharpoonright i}$ for any $i$, so 


$$\mathcal{S}_{v_\sigma \cup \{(x_\sigma, k), (x, j)\}} = \mathcal{S}_{w_\sigma \cup \{(x_\sigma, k)\}}$$

could not be empty by choice of $x_\sigma$.) The same argument shows that there must be 
infinitely many $x$ with this property, so let $x_{\sigma k}$ be any such $x > y$. 

Having defined $x_\sigma$ for all $\sigma$, let $v$ be any correct bit assignment not defined on any 
of them (which exists since we chose all $x_\sigma$ to be larger than $y$, so $v = \{(y, \Phi_y(y))\}$ 
would work). Let $\sigma_0, \ldots, \sigma_{2n}$ be any pairwise incompatible elements of $2^{<\omega}$ of the 
same length, and for $i < 2n + 1$, let $v_i = v_{\sigma_i}$. By construction, each $\mathcal{S}_{v_i}$ is nonempty. 
Furthermore, given $i < j < 2n + 1$, if $\tau$ denotes the longest common initial segment 
of $\sigma_i$ and $\sigma_j$, then 

$$v_i(x_\tau) = \sigma_i(|\tau|) \neq \sigma_j(|\tau|) = v_j(x_\tau),$$
as desired. □ 
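
The final step, that distinct strings of the same length give assignments disagreeing at the point indexed by their longest common initial segment, is purely combinatorial and can be checked mechanically. In the sketch below the numbers `x_of[tau]` are hypothetical stand-ins for the $x_\tau$ (any distinct values avoiding the domain of the base assignment would do), and `base` stands in for the correct assignment $\{(y, \Phi_y(y))\}$; nothing about $\Phi$ or the classes $\mathcal{S}_v$ is modeled.

```python
from itertools import combinations, product

def prefixes(sigma):
    """All proper initial segments of sigma, shortest first."""
    return [sigma[:i] for i in range(len(sigma))]

def assignment(sigma, x_of, base):
    """Build v_sigma: base together with the pairs (x_{sigma|i}, sigma(i)) for i < |sigma|."""
    v = dict(base)
    for i, tau in enumerate(prefixes(sigma)):
        v[x_of[tau]] = sigma[i]
    return v

def disagree(v, w):
    """True if v and w take different values at some common point of their domains."""
    return any(v[x] != w[x] for x in v.keys() & w.keys())

length = 3
strings = list(product((0, 1), repeat=length))

# Hypothetical stand-ins for the x_tau: distinct integers above y = 0.
all_prefixes = sorted({tau for s in strings for tau in prefixes(s)},
                      key=lambda t: (len(t), t))
x_of = {tau: 1 + k for k, tau in enumerate(all_prefixes)}
base = {0: 1}  # stand-in for the correct assignment {(y, Phi_y(y))}

vs = [assignment(s, x_of, base) for s in strings]
pairwise = all(disagree(v, w) for v, w in combinations(vs, 2))
```

Strings of length 3 give eight assignments, enough to supply $2n + 1$ of them for any $n \leq 3$; longer strings supply more.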


Given a condition $p$, let $I_p$ be the set of all $i < n$ such that there exists a $G$ 
satisfying $(E_0, \ldots, E_{n-1}, \mathcal{C})$ via $i$ and either $\Phi_{e_0}^{G \cap P}$ or $\Phi_{e_1}^{G \cap \overline{P}}$ is a 2-valued DNC 
function. Lemma 8.6.7 is an immediate consequence of the following: 

Lemma 8.6.12. Let $p$ be a condition so that $I_p \neq \emptyset$, and let $e_0, e_1 \in \omega$ be given. 
If $p$ forces agreement on $e_0, e_1$ inside $I_p$ then $p$ has a finite extension $q$ such that 
$|I_q| < |I_p|$. Otherwise, $p$ has an extension $q$ such that $I_q = \emptyset$. 


Proof. Let $p = (E_0, \ldots, E_{n-1}, \mathcal{C})$, and suppose first that $p$ forces agreement on 
$e_0, e_1$ inside $I_p$. Then there exists a correct bit assignment $v$ such that $\mathcal{S}_{v, I_p} = \emptyset$. 
Fixing any $X_0 \oplus \cdots \oplus X_{n-1} \in \mathcal{C}$, we then have that 


$$(X_0 \cap P) \oplus (X_0 \cap \overline{P}) \oplus \cdots \oplus (X_{n-1} \cap P) \oplus (X_{n-1} \cap \overline{P})$$


does not belong to $\mathcal{S}_{v, I_p}$. Hence, for some $i \in I_p$, there either exists a finite set $F$ 
satisfying $(E_i \cap P, X_i \cap P)$ such that $\Phi_{e_0}^F$ agrees with $v$ on some $x \in \operatorname{dom}(v)$, or else 
there exists a finite set $F$ satisfying $(E_i \cap \overline{P}, X_i \cap \overline{P})$ such that $\Phi_{e_1}^F$ agrees with $v$ on 
some $x \in \operatorname{dom}(v)$. Say the former possibility holds, the latter being analogous. We 
then fix some such $F$ and $x$ and define 


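$$E_j^* = \begin{cases} E_j & \text{if } j \neq i, \\ F & \text{if } j = i, \end{cases}$$
and let $\mathcal{Q}$ consist of all sets of the form 


$$\bigoplus_{j<i} Y_j \oplus \{y \in Y_i : y > \max F\} \oplus \bigoplus_{i<j<n} Y_j,$$

where $Y_0 \oplus \cdots \oplus Y_{n-1} \in \mathcal{C}$. Now for any $G$ satisfying $q = (E_0^*, \ldots, E_{n-1}^*, \mathcal{Q})$ via $i$, 
$F \subseteq G \cap P$, so if $\Phi_{e_0}^{G \cap P}$ is total and $\{0, 1\}$-valued then its value at $x$ equals $\Phi_x(x)$, and it is 
therefore not 2-valued DNC. Thus, $i \notin I_q$. Since clearly $I_q \subseteq I_p$, this implies that 
$|I_q| < |I_p|$. 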
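Now suppose $p$ does not force agreement on $e_0, e_1$ inside $I_p$. Then, using 
Lemma 8.6.11, fix bit assignments $v_0, \ldots, v_{2n}$ such that $\mathcal{S}_{v_i, I_p} \neq \emptyset$ for each $i < 2n + 1$, 
and for each $i < j < 2n + 1$ there is some $x \in \operatorname{dom}(v_i) \cap \operatorname{dom}(v_j)$ with $v_i(x) \neq v_j(x)$. 
Let $q$ be 

$$(E_0, \ldots, E_0, \ldots, E_{n-1}, \ldots, E_{n-1}, \operatorname{Cross}(\mathcal{S}_{v_0, I_p}, \ldots, \mathcal{S}_{v_{2n}, I_p})),$$


where each $E_i$ appears $2\binom{2n+1}{2}$ times, which by Lemma 8.6.10 is an extension of $p$. 
Let $c : 2n\binom{2n+1}{2} \to n$ witness this extension. Now consider any set $G$ satisfying $q$, 
say via $i$. Then $G$ satisfies $(E_{c(i)}, X^k_{c(i),j} \cap X^l_{c(i),j})$ for some $k < l < 2n + 1$, $j < 2$, 

$$X^k_{0,0} \oplus X^k_{0,1} \oplus \cdots \oplus X^k_{n-1,0} \oplus X^k_{n-1,1} \in \mathcal{S}_{v_k, I_p},$$
and 

$$X^l_{0,0} \oplus X^l_{0,1} \oplus \cdots \oplus X^l_{n-1,0} \oplus X^l_{n-1,1} \in \mathcal{S}_{v_l, I_p}.$$


Fix $x$ so that $v_k(x) \neq v_l(x)$, and suppose $j = 0$ for simplicity, the case where $j = 1$ 
being analogous. Then if $\Phi_{e_0}^{G \cap P}(x) \downarrow \in \{0, 1\}$ we have $\Phi_{e_0}^{G \cap P}(x) \neq v_k(x)$ since $G \cap P$ 
satisfies $(E_{c(i)} \cap P, X^k_{c(i),0})$, and so $\Phi_{e_0}^{G \cap P}(x) = v_l(x)$ since $v_k$ and $v_l$ are $\{0, 1\}$-valued. 
But $\Phi_{e_0}^{G \cap P}(x) \neq v_l(x)$ since $G \cap P$ satisfies $(E_{c(i)} \cap P, X^l_{c(i),0})$, which is a 
contradiction. Thus, $\Phi_{e_0}^{G \cap P}$ is not total. We conclude $I_q = \emptyset$, as desired. □ 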


Proof (of Lemma 8.6.7). If $I_p = \emptyset$, there is nothing to prove. Otherwise, let $p_0 = p$, 
and having defined $p_s$ for some $s \in \omega$ with $I_{p_s} \neq \emptyset$, let $p_{s+1}$ be an extension of 
$p_s$ with $|I_{p_{s+1}}| < |I_{p_s}|$, as given by the previous lemma. By that lemma, there exists 
some $s$ such that $I_{p_s} = \emptyset$, and we let this be $q$. □ 


8.7 The first order part of RT 


We now turn briefly to the first order strength of Ramsey’s theorem. For $n = 1$ and 
$n \geq 3$, we already have a near complete picture. $\mathsf{RCA}_0 \vdash \mathsf{RT}^1_k$ for each specific $k$, so its 
first order part lies below $\mathsf{I}\Sigma^0_1$ by Corollary 5.10.2 (and so is trivial, from our perspective). 
$\mathsf{RT}^1$ is equivalent to $\mathsf{B}\Sigma^0_2$ by Hirst’s theorem (Theorem 6.5.1), and so this is also its 
first order part. For $n \geq 3$, $\mathsf{RT}^n$, and also $\mathsf{RT}^n_k$ for each specific $k \geq 2$, is equivalent to 
$\mathsf{ACA}_0$, so its first order part is just $\mathsf{PA}$ by Corollary 5.10.7. For completeness, we can 
also note that $\mathsf{RCA}_0 \vdash (\forall n)\,\mathsf{RT}^n_1$, so here too the first order part is clear. As with the 
second order strength studied above, then, the interesting (and more complicated) 
case is $n = 2$. Unsurprisingly, the first order part of Ramsey’s theorem for pairs, like 
its second order part, has been the subject of a great deal of research. 


8.7.1 Two versus arbitrarily many colors 


On the technical side, our main goal for this chapter is to present a proof that the first 
order part of $\mathsf{RT}^2_2$ is strictly weaker than that of $\mathsf{RT}^2$ (even $\mathsf{SRT}^2$). Thus, in particular, 
$\mathsf{RT}^2_2$ and $\mathsf{RT}^2$ are not equivalent over $\mathsf{RCA}_0$. The two key theorems are the following. 

Theorem 8.7.1 (Cholak, Jockusch, and Slaman [33]). $\mathsf{WKL}_0 + \mathsf{RT}^2_2 + \mathsf{I}\Sigma^0_2$ is $\Pi^1_1$ 
conservative over $\mathsf{WKL}_0 + \mathsf{I}\Sigma^0_2$. 


Theorem 8.7.2 (Cholak, Jockusch, and Slaman [33]). $\mathsf{RCA}_0 + \mathsf{SRT}^2 \vdash \mathsf{B}\Sigma^0_3$. 


Since $\mathsf{B}\Sigma^0_3$ is strictly stronger than $\mathsf{I}\Sigma^0_2$ over $\mathsf{RCA}_0$ (and hence over $\mathsf{WKL}_0$, by 
Harrington’s theorem, Theorem 7.7.3), the following corollary is immediate. 


Corollary 8.7.3. For each $k \geq 1$, $\mathsf{RCA}_0 \nvdash \mathsf{RT}^2_k \to \mathsf{SRT}^2$. 


The proof of Theorem 8.7.1 utilizes the Cholak-Jockusch-Slaman decomposition, 
via the following two propositions. 


Proposition 8.7.4 (Cholak, Jockusch, and Slaman [33]). Every countable model 
of $\mathsf{WKL}_0 + \mathsf{I}\Sigma^0_2$ is an $\omega$-submodel of a model of $\mathsf{WKL}_0 + \mathsf{COH} + \mathsf{I}\Sigma^0_2$. 


Proposition 8.7.5 (Cholak, Jockusch, and Slaman [33]). Every countable model 
of $\mathsf{WKL}_0 + \mathsf{I}\Sigma^0_2$ is an $\omega$-submodel of a model of $\mathsf{WKL}_0 + \mathsf{SRT}^2_2 + \mathsf{I}\Sigma^0_2$. 


With these in hand, Theorem 8.7.1 easily follows. 


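Proof (of Theorem 8.7.1). By Theorem 5.9.4, it suffices to show that every model of 
$\mathsf{RCA}_0 + \mathsf{I}\Sigma^0_2$ is an $\omega$-submodel of a model of $\mathsf{RCA}_0 + \mathsf{RT}^2_2 + \mathsf{I}\Sigma^0_2$. Fix a countable model $M$ satisfying 
$\mathsf{RCA}_0 + \mathsf{I}\Sigma^0_2$ and suppose $c \in S^M$ is a coloring $[M]^2 \to 2$. By Corollary 5.9.6, it 
suffices to produce a set $G$ such that $M[G]$ satisfies $\mathsf{I}\Sigma^0_2$ and the fact that $G$ is an 
infinite homogeneous set for $c$. We proceed as in Lemma 8.5.3, defining an instance 
$R$ of $\mathsf{COH}$ in $S^M$ such that $c \upharpoonright [X]^2$ is stable for every solution $X$ to this instance. 
Apply Proposition 8.7.4 to find $M^* \supseteq_\omega M$ satisfying $\mathsf{RCA}_0 + \mathsf{COH} + \mathsf{I}\Sigma^0_2$. Since 
$R \in S^M \subseteq S^{M^*}$ we can choose a $\mathsf{COH}$-solution $X$ to $R$ in $S^{M^*}$. Thus, $c \upharpoonright [X]^2$ is 
an instance of $\mathsf{SRT}^2_2$ in $S^{M^*}$. Apply Proposition 8.7.5 to find $M^{**} \supseteq_\omega M^*$ satisfying 
$\mathsf{RCA}_0 + \mathsf{SRT}^2_2 + \mathsf{I}\Sigma^0_2$. Then $c \upharpoonright [X]^2 \in S^{M^{**}}$ so we can choose an $\mathsf{SRT}^2_2$-solution $G \subseteq X$ 
to $c \upharpoonright [X]^2$. Since $M^{**}$ and $M$ have the same first order part, it follows that $M[G]$ 
satisfies $\mathsf{I}\Sigma^0_2$. Obviously, $G$ is an $\mathsf{RT}^2_2$-solution to $c$ in $M[G]$, so we are done. □ 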


We delay the proofs of Propositions 8.7.4 and 8.7.5 to the next sections. For now, 
we turn to Theorem 8.7.2. The proof has some elements in common with that of 
Hirst’s theorem, but in other ways it is quite different. 


Proof (of Theorem 8.7.2). We argue in $\mathsf{RCA}_0 + \mathsf{SRT}^2$ and derive $\mathsf{B}\Pi^0_2$ (which is 
equivalent to $\mathsf{B}\Sigma^0_3$ by Theorem 6.1.3). By Theorem 8.4.9 (or Exercise 8.10.3), we 
may assume $\mathsf{B}\Pi^0_1$. Seeking a contradiction, suppose $\mathsf{B}\Pi^0_2$ is false. Thus there is a $\Pi^0_2$ 
formula $\varphi(x, y)$ and a $z$ such that the following two facts are true. 


1. $(\forall x < z)(\exists y)\, \varphi(x, y)$. 
2. $(\forall w)(\exists x < z)(\forall y < w)\, \neg\varphi(x, y)$. 

For ease of terminology, say $w$ is bad for $x < z$ if $(\forall y < w)\, \neg\varphi(x, y)$. Thus, (2) says 
that every $w$ is bad for some $x < z$. 

Let $\psi(u, v, x, y)$ be a $\Sigma^0_0$ formula such that $\varphi(x, y) \leftrightarrow (\forall u)(\exists v)\psi(u, v, x, y)$. 
Define a map $r : [\mathbb{N}]^2 \times z \to \mathbb{N}$ as follows: given $w < t$ and $x < z$, let $r(w, t, x)$ be 
the largest number $u^* < t$ such that 


$$(\exists y < w)(\forall u < u^*)(\exists v < t)\, \psi(u, v, x, y).$$


Since $u^*$ can, in principle, be 0, it follows that $r$ is total. Next, define a coloring 
$c : [\mathbb{N}]^2 \to z$, as follows: given $w < t$, let $c(w, t)$ be the least $x < z$ such that 
$r(w, t, x) \leq r(w, t, x^*)$ for all $x^* < z$. 
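
Since every quantifier in these definitions is bounded, $r$ and $c$ can be evaluated by brute force, which may help in reading the rest of the proof. The sketch below works over the ordinary natural numbers with a hypothetical matrix `psi` chosen only for illustration. In the standard model the hypotheses (1) and (2) cannot both hold, so the example illustrates the definitions and not the contradiction. Returning 0 when no $u^*$ qualifies (which can only happen for $w = 0$) is a convention of this sketch, not of the text.

```python
def r(w, t, x, psi):
    """Largest u* < t with (exists y < w)(forall u < u*)(exists v < t) psi(u, v, x, y).

    Returns 0 if no u* qualifies (only possible when w == 0); that convention is ours.
    """
    best = 0
    for u_star in range(t):
        ok = any(
            all(any(psi(u, v, x, y) for v in range(t)) for u in range(u_star))
            for y in range(w)
        )
        if ok:
            best = u_star
    return best

def c(w, t, z, psi):
    """Least x < z that minimizes r(w, t, x)."""
    values = [r(w, t, x, psi) for x in range(z)]
    return values.index(min(values))

# Hypothetical matrix: (forall u)(exists v) psi(u, v, x, y) holds exactly when y == x + 1.
psi = lambda u, v, x, y: y == x + 1 and v >= u
```

With this `psi` and $z = 3$, the number $w = 2$ is bad for $x = 1$ and $x = 2$ but not for $x = 0$. Accordingly `r(2, t, 0, psi)` grows with `t`, while `r(2, t, 1, psi)` and `r(2, t, 2, psi)` stay at 0, so `c(2, t, 3, psi)` takes the constant value 1.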

The key properties about $r$ are the following. Fix $w$ and $x < z$, and suppose first 
that $w$ is bad for $x$. Thus, $(\forall y < w)(\exists u)(\forall v)\neg\psi(u, v, x, y)$. By $\mathsf{B}\Pi^0_1$, there is a $\bar{u}$ such 
that $(\forall y < w)(\exists u < \bar{u})(\forall v)\neg\psi(u, v, x, y)$, and by $\mathsf{L}\Pi^0_1$ we may assume $\bar{u}$ is least 
with this property. Let $u^* = \bar{u} - 1$. By minimality of $\bar{u}$, we have $(\exists y < w)(\forall u < 
u^*)(\exists v)\psi(u, v, x, y)$, so by $\mathsf{B}\Pi^0_1$ we may choose an $s$ such that $(\exists y < w)(\forall u < 
u^*)(\exists v < s)\psi(u, v, x, y)$. Then for all $t > s$ we have $r(w, t, x) = u^*$. In particular, 
$\lim_t r(w, t, x)$ exists. 

Now suppose $w$ is not bad for $x$, so that $(\exists y < w)(\forall u)(\exists v)\psi(u, v, x, y)$. Fix any 
witness $y < w$. Let $m$ be arbitrary, and apply $\mathsf{B}\Pi^0_1$ and $\mathsf{L}\Pi^0_1$ to find the least $s$ such 
that $(\forall u < m)(\exists v < s)\psi(u, v, x, y)$. Then for all $t > s$, we have $r(w, t, x) \geq m$. We 
conclude that $\lim_t r(w, t, x) = \infty$. 

We now combine these two facts to prove that $c$ is stable. Fix $w$, along with any 
$r^*$ such that $\lim_t r(w, t, x) = r^*$ for some $x < z$. (This exists, since $w$ is bad for at 
least one $x < z$.) Now consider the sentence 


$$(\forall x < z)(\exists s)(\forall t > s)[r(w, t, x) = r(w, s, x) \lor r(w, t, x) > r^*].$$


By our remarks above about $r$, this sentence is true. Hence, by $\mathsf{B}\Pi^0_1$, we may fix an 
$s_0$ so that 


$$(\forall x < z)(\exists s < s_0)(\forall t > s)[r(w, t, x) = r(w, s, x) \lor r(w, t, x) > r^*].$$


So for every $x < z$, either $w$ is bad for $x$ and $r(w, t, x)$ reaches its limit by $s_0$, or 
$r(w, t, x)$ is always larger than $r^*$ past $s_0$ (irrespective of whether $w$ is bad for $x$ or not). 
By choice of $r^*$, it follows that if $t > s_0$ then $c(w, t) = x$ for some $x$ satisfying the 
first alternative, and hence that $c(w, t)$ is the same for all such $t$ (namely, $c(w, t) = x$ for the least 
$x$ that minimizes $\lim_t r(w, t, x)$). 

To complete the proof we claim that for each $x < z$ there are at most finitely 
many $w \in \mathbb{N}$ with $\lim_t c(w, t) = x$. Clearly, this is impossible in the presence of 
$\mathsf{SRT}^2$, so this yields the desired contradiction. Fix $x$. By (1), there is $y$ such that 
$(\forall u)(\exists v)\psi(u, v, x, y)$. Fix any $w > y$. Then $w$ is not bad for $x$, hence $\lim_t r(w, t, x) = 
\infty$. In particular, we cannot have $\lim_t c(w, t) = x$. The proof is complete. □ 

8.7.2 Proof of Proposition 8.7.4 


Throughout this section we work over a fixed countable model $M$ of $\mathsf{WKL} + \mathsf{I}\Sigma^0_2$. By 
Theorem 7.7.5, every countable model of $\mathsf{RCA}_0 + \mathsf{I}\Sigma^0_2$ is an $\omega$-submodel of a countable 
model of $\mathsf{WKL}_0 + \mathsf{I}\Sigma^0_2$. So it suffices to show that for a given instance $R \in S^M$ of 
$\mathsf{COH}$ there is a $G$ such that $M[G] \models \mathsf{I}\Sigma^0_2$ + “$G$ is an infinite $R$-cohesive set”. 

We will add $G$ by an elaboration on Mathias conditions in $M$, in the sense of 
Definition 7.6.1. Recall that Mathias conditions in this context are pairs $(E, I)$, where 
$E, I \in S^M$, $E$ is $M$-finite, $I$ is $M$-infinite, and $E <^M I$. Since $M$ is fixed, we omit 
saying “Mathias condition in $M$”, “forces in $M$”, etc., and say simply “Mathias 
condition”, “forces”, etc. 

We will need the following definition. 


Definition 8.7.6. A negation set is an $M$-finite collection $\Gamma$ of $\Pi^0_1$ formulas of $\mathcal{L}_2$ 
of the form $\psi(X, \vec{x})$, i.e., having one free set variable and one or more free number 
variables. 


Here and throughout, we abbreviate $\vec{d}$ being an $M$-finite tuple of elements of $M$ by 
$\vec{d} \in M$. The idea of the proof is the following. We construct a Mathias generic set 
$G$, and satisfy $\mathsf{I}\Sigma^0_2$ in $M[G]$ by considering each $\Sigma^0_2$ formula of $\mathcal{L}_2$ in turn. Such 
a formula may have $G$ as a parameter, and so has the form $(\exists \vec{x})\psi(G, \vec{x}, y)$, where 
$\psi(X, \vec{x}, y)$ is a $\Pi^0_1$ formula with parameters from $M$. In general, there are two ways 
induction can hold in $M[G]$ for a formula $\varphi(y)$: 


* $(\forall y)\varphi(y)$ holds in the model. 
* There is a $b \in M$ such that $(\forall y < b)\varphi(y) \land \neg\varphi(b)$ holds. 


Thus, we should like to show that for every $\Pi^0_1$ formula $\psi(X, \vec{x}, y)$ and every $b$, 
it is dense either to force $(\exists \vec{x})\psi(G, \vec{x}, b)$ or to force $\neg(\exists \vec{x})\psi(G, \vec{x}, b)$. However, as 
discussed in Remark 7.7.2, we need the complexity of forcing these statements to 
match that of the formulas themselves. The problem is that forcing the negation of a 
$\Sigma^0_2$ statement is too complicated (it is not $\Pi^0_2$). This is where negation sets come in. If 
we cannot force $(\exists \vec{x})\psi(G, \vec{x}, b)$ for some $b$, we add $\psi(X, \vec{x}, b)$ to a running negation 
set. We then argue that for every $\vec{d} \in M$, it is dense to force $\neg\psi(G, \vec{d}, b)$, thereby 
ensuring that $\neg(\exists \vec{x})\psi(G, \vec{x}, b)$ holds in $M[G]$. 

Let us pass to the details. Given a negation set $\Gamma$, we label each Mathias condition 
as either $\Gamma$-large or $\Gamma$-small, in a way that will be defined below. The specifics of 
this are not needed to state the key lemmas and derive Proposition 8.7.4 from them, 
so we do that first. The following technical lemma spells out the key properties of 
largeness and smallness that we will need. 


Lemma 8.7.7. Let $\Gamma$ be a negation set. 


1. If $(E, I)$ and $(E^*, I^*)$ are Mathias conditions, $(E, I)$ is $\Gamma$-large, and $(E^*, I^*)$ is a 
finite extension of $(E, I)$ (i.e., $(E^*, I^*) \leq (E, I)$ and $I \setminus I^*$ is finite), then $(E^*, I^*)$ 
is $\Gamma$-large. 

2. If $(E, I)$ is a $\Gamma$-large Mathias condition and $\psi(X, \vec{x}) \in \Gamma$, then $(E, I)$ does not 
force $\psi(G, \vec{d})$ for any $\vec{d} \in M$. 

3. If $(E, I)$ is a $\Gamma$-large Mathias condition, $a \in M$, and for each $j < a$, $E_j$ and $I_j$ 
are sets in $S^M$ such that $E_j$ is $M$-finite, $E \subseteq E_j \subseteq E \cup I$, and $I = \bigcup_{j<a} I_j$, 
then there is a $j < a$ such that $(E_j, I_j)$ is $\Gamma$-large. 


The argument will be organized using an auxiliary notion of forcing that pairs 
Mathias conditions with negation sets. 


Definition 8.7.8. We define the following notion of forcing. 


1. The conditions are triples $(\Gamma, E, I)$, where $\Gamma$ is a negation set and $(E, I)$ is a 
$\Gamma$-large Mathias condition. 

2. Extension is defined by $(\Gamma^*, E^*, I^*) \leq (\Gamma, E, I)$ if $\Gamma \subseteq \Gamma^*$ and $(E^*, I^*)$ extends 
$(E, I)$ as Mathias conditions. 

3. The valuation of $(\Gamma, E, I)$ is the valuation of $(E, I)$ as a Mathias condition. 


As we will see below, the set of conditions here is nonempty. Because the valuation 
map ignores the negation set, a generic for this forcing is thus also a Mathias generic 
(but not conversely). By the same token, $(\Gamma, E, I)$ forces a formula if and only if the 
Mathias condition $(E, I)$ forces it. 

The following lemmas help make formal our discussion above concerning preserving 
$\mathsf{I}\Sigma^0_2$ in $M[G]$. We will use the following notation for convenience. Given 
$\vec{d} \in M$ with $\vec{d} = \langle d_0, \ldots, d_{a-1} \rangle$ for some $a \in M$, and a formula $\psi(X, \vec{x})$ with 
$\vec{x} = \langle x_0, \ldots, x_{n-1} \rangle$ for some $n \in \omega$, writing $\psi(X, \vec{d})$ means that $a \geq^M n$ and 
refers to the formula where, for each $i < n$, $x_i$ has been replaced by $d_i$. 


Lemma 8.7.9. Let $\psi(X, \vec{x}, y)$ be a $\Pi^0_1$ formula of $\mathcal{L}_2$. For every $a \in M$, the set of 
conditions $(\Gamma, E, I)$ satisfying one of the following two properties is dense. 


1. For every $b \leq^M a$, $(E, I) \Vdash^M (\exists \vec{x})\psi(G, \vec{x}, b)$. 
2. There is a $b \leq^M a$ such that for every $c <^M b$, $(E, I) \Vdash^M (\exists \vec{x})\psi(G, \vec{x}, c)$ and 
$\psi(X, \vec{x}, b) \in \Gamma$. 


Lemma 8.7.10. Let $(\Gamma, E, I)$ be a condition and $\psi(X, \vec{x}) \in \Gamma$. For every $\vec{d} \in M$, the 
set of conditions $(\Gamma^*, E^*, I^*)$ forcing $\neg\psi(G, \vec{d})$ is dense below $(\Gamma, E, I)$. 


Proposition 8.7.4 now follows easily. 


Proof (of Proposition 8.7.4). Let $G$ be sufficiently generic for the forcing in Definition 8.7.8. 
To see that $M[G] \models \mathsf{I}\Sigma^0_2$, let $\psi(X, \vec{x}, y)$ be a $\Pi^0_1$ formula of $\mathcal{L}_2$ with parameters 
from $M$. By Lemma 8.7.9, either $M[G] \models (\exists \vec{x})\psi(G, \vec{x}, b)$ for every $b \in M$, or 
there is a $b \in M$ such that $M[G] \models (\exists \vec{x})\psi(G, \vec{x}, c)$ for every $c <^M b$ and, by 
Lemma 8.7.10, $M[G] \models \neg\psi(G, \vec{d}, b)$ for every tuple $\vec{d}$ of elements of $M$, meaning 
$M[G] \models \neg(\exists \vec{x})\psi(G, \vec{x}, b)$. Thus, induction holds. To see that $G$ is cohesive for a 
given instance of $\mathsf{COH}$ in $S^M$, fix any set $R \in S^M$. By Lemma 8.7.7, if $(\Gamma, E, I)$ is a 
condition then so is either $(\Gamma, E, I \cap R)$ or $(\Gamma, E, I \cap \overline{R})$. Hence, the set of conditions 
$(\Gamma, E, I)$ with $I \subseteq^* R$ or $I \subseteq^* \overline{R}$ is dense, so either $G \subseteq^* R$ or $G \subseteq^* \overline{R}$. (Thus, $G$ is 
actually simultaneously cohesive for every instance of $\mathsf{COH}$ in $S^M$.) □ 

So, it remains to prove the lemmas, and to this end we of course first need to make 
precise what it means for a condition to be $\Gamma$-large or $\Gamma$-small. 


Definition 8.7.11. Let $\Gamma = \{\psi_i(X, \vec{x}) : i < m\}$ be a negation set. 
1. A Mathias condition $(E, I)$ is $\Gamma$-small if there exist 


* $\ell \in M$, 
* $M$-finite sequences 
  - $\langle m_i : i <^M \ell \rangle$, of elements of $M$, 
  - $\langle \vec{d}_i : i <^M \ell \rangle$, of $M$-finite tuples of elements of $M$, 
  - $\langle E_i : i <^M \ell \rangle$, of $M$-finite subsets of $M$, 
  each coded by an element of $M$, 
* an $M$-finite sequence $\langle I_i : i <^M \ell \rangle$ of elements of $S^M$, coded as an element 
of $S^M$, 


such that the following hold in $M$: 


a. $I = \bigcup_{i<\ell} I_i$, 
b. for each $i < \ell$, $E \subseteq E_i \subseteq E \cup I$ and either $I_i$ is bounded by $\max \vec{d}_i$ or 
$\psi_{m_i}(E_i \cup F, \vec{d}_i)$ for every finite set $F \subseteq I_i$. 


2. A Mathias condition $(E, I)$ that is not $\Gamma$-small is $\Gamma$-large. 


Proof (of Lemma 8.7.7). We leave the proofs of (1) and (2) to the exercises 
(Exercise 8.10.9). 

For (3), fix $\Gamma$. Let $\varphi(X, E, I, x)$ be the formula asserting: that there exist $\ell < x$ 
and (codes for) finite sequences $\langle m_i : i < \ell \rangle$, $\langle \vec{d}_i : i < \ell \rangle$, and $\langle E_i : i < \ell \rangle$ smaller 
than $x$; that $X = \langle I_i : i < \ell \rangle$; and that clauses (a) and (b) in Definition 8.7.11 hold. 
Then a Mathias condition $(E, I)$ is $\Gamma$-small precisely if $M \models (\exists x)(\exists X)\varphi(X, E, I, x)$. 
Notice that clauses (a) and (b) are $\Pi^0_1$, and therefore so is $\varphi$. By Exercise 5.13.16, 
there is a $\Sigma^0_0$ formula $\theta(X, E, I, x)$ such that in $M$, a set satisfies $\varphi(X, E, I, x)$ if and 
only if $(\forall k)\theta(X \upharpoonright k, E, I, x)$. Given $a \in M$, let $T_a$ be the set of all $M$-finite binary 
sequences $\sigma$ such that $(\forall k \leq |\sigma|)\theta(\sigma \upharpoonright k, E, I, a)$. Then $\langle T_a : a \in M \rangle$ belongs to 
$S^M$. Moreover, each $T_a$ is a binary tree and $\varphi(X, E, I, a)$ holds in $M$ if and only if 
$X$ is an $M$-infinite path through $T_a$. Since $M$ satisfies $\mathsf{WKL}$, $T_a$ has such a path if and 
only if $T_a$ is infinite. Thus, $(\exists X)\varphi(X, E, I, x)$ is equivalent in $M$ to a $\Pi^0_1$ formula. 

Now, fix a $\Gamma$-large Mathias condition $(E, I)$, $a \in M$, and sets $E_j$ and $I_j$ for $j <^M a$ 
as in the statement of the lemma. We must show that there is a $j$ such that $(E_j, I_j)$ is 
a $\Gamma$-large Mathias condition. Seeking a contradiction, suppose that for each $j <^M a$, 
either $I_j$ is $M$-finite or $(E_j, I_j)$ is $\Gamma$-small. Then $M$ satisfies 


$$(\forall j < a)(\exists x)[(\forall y)[y \in I_j \to y < x] \lor (\exists X)\varphi(X, E_j, I_j, x)].$$


By the discussion above, the matrix of this formula is equivalent in $M$ to a $\Pi^0_1$ 
formula. Since $M$ satisfies $\mathsf{I}\Sigma^0_2$, it also satisfies $\mathsf{B}\Pi^0_1$. Thus, we may fix a $b \in M$ such 
that 

$$(\forall j < a)(\exists x < b)[(\forall y)[y \in I_j \to y < x] \lor (\exists X)\varphi(X, E_j, I_j, x)].$$
Let $\theta(z)$ be the formula 
$$(\exists X)(\forall j < z)(\exists x < b)[(\forall y)[y \in I_j \to y < x] \lor \varphi(X^{[j]}, E_j, I_j, x)].$$
By Proposition 6.1.2, the formula 
$$(\forall j < z)(\exists x < b)[(\forall y)[y \in I_j \to y < x] \lor \varphi(X^{[j]}, E_j, I_j, x)]$$


is equivalent in $M$ to a $\Pi^0_1$ formula, and therefore so is $\theta(z)$. We may thus use $\mathsf{I}\Pi^0_1$ in 
$M$ to prove that $(\forall z \leq a)\theta(z)$ holds. Trivially, $M \models \theta(0)$. So fix a nonzero $z \leq^M a$, 
and assume $M \models \theta(z - 1)$, as witnessed by $X \in S^M$. If $I_{z-1}$ is $M$-finite, we let 
$X_z = X \cup (\{z - 1\} \times \langle I_{z-1} \rangle)$, so that $X_z^{[z-1]} = \langle I_{z-1} \rangle$, and then $X_z$ witnesses that 
$M \models \theta(z)$. If $I_{z-1}$ is $M$-infinite, then by hypothesis we can fix $X_{z-1} \in S^M$ such 
that $(\exists x < b)\varphi(X_{z-1}, E_{z-1}, I_{z-1}, x)$. If we let $X_z = X \cup (\{z - 1\} \times X_{z-1})$, so that 
$X_z^{[z-1]} = X_{z-1}$, then $X_z$ witnesses that $M \models \theta(z)$. 

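To complete the proof, fix $X_a \in S^M$ witnessing that $M \models \theta(a)$. By bounded $\Pi^0_1$ 
comprehension (using $X_a$ as a parameter), we can form the set of all pairs $\langle j, x_j \rangle$ 
with $j <^M a$ and $x_j <^M b$ such that $M \models \varphi(X_a^{[j]}, E_j, I_j, x_j)$. Now by $\mathsf{L}\Pi^0_1$, for 
each $j <^M a$ we can fix an $\ell_j <^M x_j$ and (codes for) sequences $\langle m_{j,i} : i <^M \ell_j \rangle$, 
$\langle \vec{d}_{j,i} : i <^M \ell_j \rangle$, and $\langle E_{j,i} : i <^M \ell_j \rangle$ smaller than $x_j$ such that clauses (a) 
and (b) in Definition 8.7.11 hold with the sequence of sets coded by $X_a^{[j]}$. Let 
$\ell = \sum_{j <^M a} \ell_j$, and let $\ell_{-1} = 0$ for definiteness. Define a sequence $\langle m_k : k <^M \ell \rangle$ 
as follows: for $k <^M \ell$, fix $j <^M a$ and $i <^M \ell_j$ such that $k = \ell_{j-1} + i$, and let 
$m_k = m_{j,i}$. (So, $m_0 = m_{0,0}$, $m_1 = m_{0,1}$, $\ldots$, $m_{\ell_0} = m_{1,0}$, $m_{\ell_0 + 1} = m_{1,1}$, $\ldots$, etc.) 
Define $\langle \vec{d}_k : k <^M \ell \rangle$ and $\langle E_k : k <^M \ell \rangle$ analogously. Also, for each $j <^M a$ we 
can decode $X_a^{[j]}$ as $\langle I_{j,i} : i <^M \ell_j \rangle$, and so analogously define a sequence of sets 
$\langle I_k : k <^M \ell \rangle$. As $M$ satisfies $\mathsf{B}\Sigma^0_2$, an $M$-finite union of $M$-finite sets is $M$-finite, 
by Proposition 6.5.4. Thus, each of these sequences is $M$-finite. But now it is easy 
to see that $\ell$, $\langle m_k : k <^M \ell \rangle$, $\langle \vec{d}_k : k <^M \ell \rangle$, $\langle E_k : k <^M \ell \rangle$, and $\langle I_k : k <^M \ell \rangle$ 
witness that $(E, I)$ is $\Gamma$-small, which is a contradiction. □ 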
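
The re-indexing used to merge the sequences is the usual concatenation of blocks. The sketch below reads the offset of block $j$ as the sum of the lengths of the earlier blocks, which is an interpretation of ours but agrees with the displayed pattern $m_{\ell_0} = m_{1,0}$, $m_{\ell_0 + 1} = m_{1,1}$. The function name and the sample lengths are hypothetical, and ordinary Python lists stand in for $M$-finite sequences.

```python
from itertools import accumulate

def flatten_index(lengths):
    """List, for each k < sum(lengths), the pair (j, i) with k = offset_j + i and i < lengths[j].

    offset_j is the total length of the blocks before block j.
    """
    offsets = [0] + list(accumulate(lengths))[:-1]
    table = []
    for j, (start, length) in enumerate(zip(offsets, lengths)):
        for i in range(length):
            assert len(table) == start + i  # position k matches offset_j + i
            table.append((j, i))
    return table

lengths = [2, 0, 3]  # hypothetical values of l_0, l_1, l_2
table = flatten_index(lengths)
```

Here `table[k]` gives the pair `(j, i)` from which the $k$-th merged entry is taken, so `table[2]` is `(2, 0)` because the middle block is empty.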


Proof (of Lemma 8.7.9). Fix a $\Pi^0_1$ formula $\psi(X, \vec{x}, y)$, $a \in M$, and a condition 
$(\Gamma, E, I)$. We must find an extension $(\Gamma^*, E^*, I^*)$ satisfying either item (1) or (2) 
in the statement of the lemma. 

Let $\varphi(\ell, z)$ be the formula asserting that $z \leq a$ and there exist (codes for) finite 
sequences $\langle m_i : i <^M \ell \rangle$, $\langle \vec{d}_i : i <^M \ell \rangle$, $\langle E_i : i <^M \ell \rangle$, $\langle I_i : i <^M \ell \rangle$, and 
$\langle \vec{e}_i : i < \ell \rangle$ such that one of the following holds. 


* Clauses (a) and (b) in Definition 8.7.11 hold with $\ell$ and the sequences $\langle m_i : 
i <^M \ell \rangle$, $\langle \vec{d}_i : i <^M \ell \rangle$, $\langle E_i : i <^M \ell \rangle$, and $\langle I_i : i <^M \ell \rangle$. 

* For each $i < \ell$, $\vec{e}_i = \langle e_{i,y} \in M : y \leq z \rangle$ and $\psi(E_i \cup F, e_{i,y}, y)$ holds for every 
$y \leq z$ and every finite set $F \subseteq I_i$. 

Just as in the proof of the preceding lemma, $\varphi$ is equivalent in $M$ to a $\Pi^0_1$ formula. 
Thus, $(\exists \ell)\varphi(\ell, z)$ is $\Sigma^0_2$. We consider two cases. 


Case 1: $M \models (\exists \ell)\varphi(\ell, a)$. Fix a witnessing $\ell$ and corresponding sequences $\langle m_i : 
i <^M \ell \rangle$, $\langle \vec{d}_i : i <^M \ell \rangle$, $\langle E_i : i <^M \ell \rangle$, $\langle I_i : i <^M \ell \rangle$, and $\langle \vec{e}_i : i < \ell \rangle$. Since $(E, I)$ 
is $\Gamma$-large, it follows by Lemma 8.7.7 (3) that $(E_i, I_i)$ is $\Gamma$-large for some $i <^M \ell$. This 
means the first case in the definition of $\varphi$ above does not hold for $i$, so the second must 
hold. Thus, $(E_i, I_i)$ forces $\psi(G, e_{i,b}, b)$, and therefore in particular, $(\exists \vec{x})\psi(G, \vec{x}, b)$, 
for every $b \leq^M a$. In this case, we can therefore take $(\Gamma^*, E^*, I^*) = (\Gamma, E_i, I_i)$. 


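Case 2: Otherwise. So $M \models (\exists z \leq a)\neg(\exists \ell)\varphi(\ell, z)$. Since $M$ satisfies $\mathsf{I}\Sigma^0_2$ (and 
therefore $\mathsf{L}\Pi^0_2$), we can fix the least $b \leq a$ such that $M \models \neg(\exists \ell)\varphi(\ell, b)$. Since 
no $\ell$ and no sequences witness that the first case in the definition of $\varphi$ holds, this 
means in particular that $(E, I)$ is $\Gamma \cup \{\psi(X, \vec{x}, b)\}$-large. If $b = 0$, then we can 
take $(\Gamma^*, E^*, I^*) = (\Gamma \cup \{\psi(X, \vec{x}, b)\}, E, I)$. So suppose $b >^M 0$. Fix $\ell$ such that 
$M \models \varphi(\ell, b - 1)$ and fix the corresponding sequences $\langle m_i : i <^M \ell \rangle$, $\langle \vec{d}_i : i <^M \ell \rangle$, 
$\langle E_i : i <^M \ell \rangle$, $\langle I_i : i <^M \ell \rangle$, and $\langle \vec{e}_i : i < \ell \rangle$. By Lemma 8.7.7 (3), $(E_i, I_i)$ is 
$\Gamma$-large for some $i <^M \ell$. As in the preceding case, it thus follows that $(E_i, I_i)$ forces 
$(\exists \vec{x})\psi(G, \vec{x}, c)$ for every $c <^M b$. Hence, $(\Gamma^*, E^*, I^*) = (\Gamma \cup \{\psi(X, \vec{x}, b)\}, E_i, I_i)$ 
can serve as the desired extension. □ 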


Proof (of Lemma 8.7.10). Fix a condition $(\Gamma, E, I)$, a formula $\psi(X, \vec{x})$ in $\Gamma$, and 
$\vec{d} \in M$. Let $(\Gamma^*, E^*, I^*)$ be any extension of $(\Gamma, E, I)$. We must exhibit an extension 
of this condition that forces $\neg\psi(G, \vec{d})$. By Lemma 8.7.7 (2), $(E^*, I^*)$ does not force 
$\psi(G, \vec{d})$. Since $\psi(X, \vec{x})$ is $\Pi^0_1$, it is really $\neg\theta(X, \vec{x})$ for some $\Sigma^0_1$ formula $\theta$. Thus, the 
fact that $(E^*, I^*)$ does not force $\psi(G, \vec{d})$ means some $(E^{**}, I^{**}) \leq (E^*, I^*)$ forces 
$\theta(G, \vec{d})$. By Exercise 7.8.10, this means that $(E^{**}, I^* \setminus \max E^{**})$ forces $\theta(G, \vec{d})$. By 
Proposition 7.4.2, $(E^{**}, I^* \setminus \max E^{**})$ also forces $\neg\neg\theta(G, \vec{d})$, which is $\neg\psi(G, \vec{d})$. 
By Lemma 8.7.7 (1), the finite extension $(E^{**}, I^* \setminus \max E^{**})$ is $\Gamma^*$-large. Hence, 
$(\Gamma^*, E^{**}, I^* \setminus \max E^{**})$ is the desired extension. □ 


8.7.3 Proof of Proposition 8.7.5 


We again fix a countable model $M$ of $\mathsf{WKL} + \mathsf{I}\Sigma^0_2$, and this time also fix an 
instance $c \in S^M$ of $\mathsf{SRT}^2_2$. We show that there is a $G$ such that $M[G] \models 
\mathsf{I}\Sigma^0_2$ + “$G$ is an infinite limit homogeneous set for $c$”. The argument is very similar 
to that in the preceding section, but with modifications. We begin with a generalization 
of Mathias forcing. Let us say here that a formula $\varphi(x)$ of $\mathcal{L}_2$ is $\Delta^0_2$ in $M$ if 
it is $\Sigma^0_2$ and equivalent in $M$ to a $\Pi^0_2$ formula. 


Definition 8.7.12. Mathias pseudo-forcing is the following notion of forcing: 


* The conditions are pairs $(E, \varphi)$, where $E \in S^M$ is $M$-finite and $\varphi$ is a $\Delta^0_2$ 
formula in $M$ such that $M \models (\forall x)[\varphi(x) \to x > E] \land (\forall x)(\exists y)[y > x \land \varphi(y)]$. 

* Extension is defined by $(E^*, \varphi^*) \leq (E, \varphi)$ if $M \models E \subseteq E^* \land (\forall x)[x \in E^* \to 
x \in E \lor \varphi(x)] \land (\forall x)[\varphi^*(x) \to \varphi(x)]$. 

* The valuation of $(E, \varphi)$ is the string $\sigma \in 2^{<M}$ of length the $\leq$-least element 
$a \in M$ such that $M \models \varphi(a)$ (which exists, since $M \models \mathsf{I}\Sigma^0_2$) with $M \models 
(\forall x)[\sigma(x) = 1 \leftrightarrow x \in E]$. 


We will call the conditions in Mathias pseudo-forcing Mathias pseudo-conditions, 
although it should be noted that this is a bona fide forcing notion in $M$ in the 
sense of Definition 7.6.1. Every Mathias condition is clearly a Mathias pseudo-condition. 
Conversely, a Mathias pseudo-condition $(E, \varphi)$ is a Mathias condition 
only if $\{a \in M : M \models \varphi(a)\}$ belongs to $S^M$ (which it need not). For ease of 
notation, we write $(E, \text{“}I\text{”})$ if $(E, \varphi)$ is a Mathias pseudo-condition and $I$ is the set 
defined by $\varphi$. We can then also write things like $a \in \text{“}I\text{”}$, $\text{“}I\text{”} \subseteq S$ for $S \in S^M$, etc., 
as shorthands for $M \models \varphi(a)$, $M \models (\forall x)[\varphi(x) \to x \in S]$, etc. 

For each $i < 2$, let $P_i = \{x \in M : M \models \lim_y c(x, y) = i\}$, and note that $P_0$ and $P_1$ 
are $\Delta^0_2$-definable in $M$ by Remark 8.4.3. Say a Mathias pseudo-condition $(E, \text{“}I\text{”})$ is 
$i$-acceptable if $M \models E \subseteq \text{“}P_i\text{”}$. A generic $G$ for forcing with $i$-acceptable Mathias 
conditions is thus a subset of $P_i$, meaning a limit homogeneous set for $c$ with color $i$. 
Below, we will first restrict to 0-acceptable Mathias conditions in an attempt to make 
$G \subseteq P_0$, and failing that, we will switch to 1-acceptable Mathias conditions and 
build $G \subseteq P_1$. We need to suitably adapt the definition of smallness and largeness to 
0-acceptable and 1-acceptable Mathias conditions and pseudo-conditions, and prove 
analogs of Lemmas 8.7.7, 8.7.9 and 8.7.10. We do this slightly differently in the 
0-acceptable and 1-acceptable case. 

First, we restrict to 0-acceptable Mathias conditions and pseudo-conditions and 
define smallness and largeness. 


Definition 8.7.13. Let $\Gamma$ be a negation set. 


1. A Mathias condition $(E, I)$ is $\Gamma$-small$_0$ if it is 0-acceptable and $\Gamma$-small as 
in Definition 8.7.11, but with the modification that the sequence $\langle E_i : i <^M \ell \rangle$ 
additionally satisfies $M \models (\forall i < \ell)[E_i \subseteq \text{“}P_0\text{”}]$. 

2. A Mathias pseudo-condition $(E, \text{“}I\text{”})$ is $\Gamma$-small$_0$ if it is 0-acceptable and 
$\Gamma$-small as defined for Mathias conditions, but with the following modification. 
Instead of an $M$-finite sequence $\langle I_i : i <^M \ell \rangle$ of elements of $S^M$ such that 
$M \models I = \bigcup_{i<\ell} I_i$, there is a $\Delta^0_2$ formula $\varphi(i, x)$ such that, writing “$I_i$” for 
$\{a \in M : M \models \varphi(i, a)\}$, we have $M \models \text{“}I\text{”} = \bigcup_{i<\ell} \text{“}I_i\text{”}$. The rest of the 
definition is then unchanged, except that $I$ is replaced by “$I$” and $I_i$ by “$I_i$” 
throughout. 


Thus, the only distinction between the definition for Mathias conditions and for
Mathias pseudo-conditions is whether the partitions in the definition need to be
actual sets in $\mathcal{S}^{\mathcal{M}}$ or merely $\Delta^0_2$-definable in $\mathcal{M}$. Although Mathias conditions are
pseudo-conditions, and so in principle could be $\Gamma$-small$_0$ in either sense above, it will
always be clear which is meant. Whenever we specify a Mathias condition, $\Gamma$-small$_0$
is to be interpreted as (1), and whenever we specify a Mathias pseudo-condition,
$\Gamma$-small$_0$ is to be interpreted as (2).
Let us now consider Lemma 8.7.7. Part (1) is unproblematic and goes through with
basically the same proof. The same is true of part (3), only we note that the additional
clause above that $\mathcal{M} \models (\forall i < \ell)[E_i \subseteq \text{“}P_0\text{”}]$ is equivalent to a $\Sigma^0_2$ formula (and so
does not increase the complexity of being $\Gamma$-small$_0$). This is because $\mathsf{B}\Sigma^0_2$ holds in
$\mathcal{M}$, and therefore $\mathcal{M} \models E_i \subseteq \text{“}P_0\text{”} \leftrightarrow (\exists w)(\forall x \in E_i)(\forall y)[y > w \to c(x, y) = 0]$.
Part (2) is more subtle, and requires an additional hypothesis to go through
here.


For every negation set $\Gamma$, whenever a Mathias condition $(E, I)$ is
$\Gamma$-large$_0$ then so is the Mathias pseudo-condition $(E, I \cap \text{“}P_0\text{”})$.     (8.4)


Lemma 8.7.14. Let $\Gamma$ be a negation set and assume (8.4) holds. If $(E, I)$ is a $\Gamma$-large$_0$
Mathias condition and $\psi(X, x) \in \Gamma$, then $(E, I)$ does not force $\psi(G, d)$ (in the partial
order of 0-acceptable Mathias conditions) for any $d \in M$.


Proof. Suppose towards a contradiction that $(E, I)$ forces $\psi(G, d)$. Since $\psi$ is $\Pi^0_1$
and we have restricted our forcing to just 0-acceptable Mathias conditions, it follows
from the definition of forcing and Exercise 7.8.10 that $\psi(E \cup F, d)$ holds for every
$\mathcal{M}$-finite set $F \subseteq I$ such that $\mathcal{M} \models F \subseteq \text{“}P_0\text{”}$. But then it is readily seen that
$(E, I \cap \text{“}P_0\text{”})$ is actually $\Gamma$-small (by the same proof as in the original lemma). □


We can now formulate an analog of the auxiliary forcing notion in Definition 8.7.8. 
Namely, conditions are triples $(\Gamma, E, I)$, where $\Gamma$ is a negation set and $(E, I)$ is a
$\Gamma$-large$_0$ Mathias condition. Extension and valuation are defined as before. With
this, Lemmas 8.7.9 and 8.7.10 now carry over with obvious modifications, using 
Lemma 8.7.14 in place of Lemma 8.7.7 (2) where needed. (It is worth emphasizing 
that these all involve only the auxiliary forcing notion, and hence ultimately only 
Mathias conditions. Mathias pseudo-conditions are only used in stating condition 
(8.4) and proving Lemma 8.7.14.) All the pieces can now be assembled as in the 
previous section. 


Proof (of Proposition 8.7.5 under the hypothesis (8.4)). We may assume $c$ has no
solution in $\mathcal{M}$, as otherwise there is nothing to do. Thus, for every $\mathcal{M}$-infinite set
$S \in \mathcal{S}^{\mathcal{M}}$ it must be that $\mathcal{M} \models (\forall x)(\exists y)[y > x \wedge y \in S \cap \text{“}P_0\text{”}]$, as otherwise for
some $b \in M$, $\{a \in S : a > b\}$ would be an $\mathcal{M}$-infinite subset of $P_1$ in $\mathcal{S}^{\mathcal{M}}$, and
hence a solution to $c$ in $\mathcal{M}$. It follows that for each $a \in M$, the set of conditions
$(\Gamma, E, I)$ such that $\mathcal{M} \models (\exists x)[x \in E \wedge x > a]$ is dense, and a generic $G$ for this
forcing is consequently an $\mathcal{M}$-infinite subset of $P_0$. All that remains is to verify that
$\mathcal{M}[G]$ satisfies $\mathsf{I}\Sigma^0_2$, and this is done exactly as in the proof of Proposition 8.7.4 in the
previous section, but using the analogs of Lemmas 8.7.7, 8.7.9 and 8.7.10 discussed
above. □


To complete the proof, we must handle the case that (8.4) fails. Let us fix a
counterexample, then, which is a negation set $\widehat{\Gamma}$ and Mathias condition $(\widehat{E}, \widehat{I})$. Thus,
$(\widehat{E}, \widehat{I})$ is $\widehat{\Gamma}$-large$_0$ but the Mathias pseudo-condition $(\widehat{E}, \widehat{I} \cap \text{“}P_0\text{”})$ is $\widehat{\Gamma}$-small$_0$. All
definitions below will be made with respect to $\widehat{\Gamma}$ and $(\widehat{E}, \widehat{I})$. We start again with
smallness, this time for 1-acceptable Mathias pseudo-conditions.
Definition 8.7.15. Let $\Gamma \supseteq \widehat{\Gamma}$ be a negation set.


1. A Mathias condition $(E, I)$ is $\Gamma$-small$_1$ if it is 1-acceptable, $I \subseteq \widehat{I}$, and $(E, I)$ is
$\Gamma$-small as in Definition 8.7.11, but with clause (b) modified to say that either
$I_i$ is bounded by $\max d_i$, or $\psi_{m_i}(E_i \cup F, d_i)$ for every finite set $F \subseteq I_i$, or
$\psi_{m_i}(\widehat{E} \cup F, d_i)$ for every finite set $F \subseteq I_i$.

2. A Mathias pseudo-condition $(E, \text{“}I\text{”})$ is $\Gamma$-small$_1$ if it is 1-acceptable,
$\mathcal{M} \models \text{“}I\text{”} \subseteq \widehat{I}$, and $(E, \text{“}I\text{”})$ is $\Gamma$-small$_1$ as defined for Mathias conditions, but with
the following modification. Instead of an $\mathcal{M}$-finite sequence $\langle I_i : i <^{\mathcal{M}} \ell \rangle$ of
elements of $\mathcal{S}^{\mathcal{M}}$ such that $\mathcal{M} \models I = \bigcup_{i < \ell} I_i$, there is a $\Delta^0_2$ formula $\gamma(i, x)$ such
that, writing “$I_i$” for $\{a \in M : \mathcal{M} \models \gamma(i, a)\}$, we have $\mathcal{M} \models \text{“}I\text{”} = \bigcup_{i < \ell} \text{“}I_i\text{”}$.
The rest of the definition is then unchanged, except that $I$ is replaced by “$I$” and
$I_i$ by “$I_i$” throughout.


Looking next at Lemma 8.7.7, parts (1) and (3) again lift without issue, just as
on the 0-acceptable side. In part (3), a Mathias condition being $\Gamma$-small$_1$ is $\Sigma^0_2$-definable
by the same argument as for $\Gamma$-small$_0$. Part (2) now carries over without
any additional hypotheses.


Lemma 8.7.16. If $(E, I)$ is a $\Gamma$-large$_1$ Mathias condition and $\psi(X, x) \in \Gamma$, then
$(E, I)$ does not force $\psi(G, d)$ (in the partial order of 1-acceptable Mathias conditions)
for any $d \in M$.


Proof. We show that if $(E, I)$ is $\Gamma$-large$_1$ then so is the Mathias pseudo-condition
$(E, I \cap \text{“}P_1\text{”})$. The conclusion then follows just like in Lemma 8.7.14. Seeking a
contradiction, suppose $(E, I \cap \text{“}P_1\text{”})$ is $\Gamma$-small$_1$. Since $I \subseteq \widehat{I}$ and $(\widehat{E}, \widehat{I} \cap \text{“}P_0\text{”})$ is
$\widehat{\Gamma}$-small$_0$, it is readily seen from the definitions that $(E, I \cap \text{“}P_0\text{”})$ is $\widehat{\Gamma}$-small$_1$. Since
$\Gamma \supseteq \widehat{\Gamma}$, this means that $(E, I \cap \text{“}P_0\text{”})$ is actually $\Gamma$-small$_1$. Since $(I \cap P_0) \cup (I \cap P_1) = I$,
the idea now is to combine the witnesses for $(E, I \cap \text{“}P_0\text{”})$ and $(E, I \cap \text{“}P_1\text{”})$ to
conclude that $(E, I)$ is $\Gamma$-small$_1$. We can easily fix $a \in M$ that bounds all the (codes
for) sequences of elements of $M$, tuples of elements of $M$, and $\mathcal{M}$-finite subsets of
$M$ that witness that $(E, I \cap \text{“}P_0\text{”})$ and $(E, I \cap \text{“}P_1\text{”})$ are $\Gamma$-small$_1$. However, for the
witnessing partitions of reservoirs, this is more complicated. Since $(E, I \cap \text{“}P_0\text{”})$ and
$(E, I \cap \text{“}P_1\text{”})$ are Mathias pseudo-conditions, these partitions are given by formulas.
By contrast, $(E, I)$ is a Mathias condition, so a witnessing partition for it needs to
be an actual element of $\mathcal{S}^{\mathcal{M}}$. This can be handled as follows. As in the proof of
Lemma 8.7.7 (3), there is a $\Sigma^0_0$ formula $\theta(X, Y, Z, x)$ such that a Mathias condition
$(E^*, I^*)$ is $\Gamma$-small$_1$ precisely if $\mathcal{M} \models (\exists x)(\exists X)(\forall k)\theta(X \upharpoonright k, E^*, I^*, x)$. Let $T$ be the
set of all $\mathcal{M}$-finite binary sequences $\sigma$ such that $(\forall k \le |\sigma|)\theta(\sigma \upharpoonright k, E, I, a)$. By
choice of $a$, $(E, I)$ will be $\Gamma$-small$_1$ if we can show that there is an infinite path
$X$ through $T$. Since $\mathcal{M} \models \mathsf{WKL}$, it suffices to show that for every $b \in M$ there is a
$\sigma \in T$ with $|\sigma| = b$. Let $\ell_0$ and $\ell_1$ be such that the witnessing partitions of $I \cap P_0$
and $I \cap P_1$ have sizes $\ell_0$ and $\ell_1$, respectively, and let $\gamma_0$ and $\gamma_1$ be the formulas
defining these partitions. Define $\sigma \in 2^b$ as follows: given $\langle j, x \rangle < b$, if $j <^{\mathcal{M}} \ell_0$ let
$\sigma(\langle j, x \rangle) = 1$ if and only if $\mathcal{M} \models \gamma_0(j, x)$, and if $\ell_0 \le^{\mathcal{M}} j$ let $\sigma(\langle j, x \rangle) = 1$ if and
only if $\mathcal{M} \models \gamma_1(j - \ell_0, x)$. Then $\sigma$ is an initial segment of the characteristic function
of the partition of $I$ obtained by merging the partitions (of $I \cap P_0$ and $I \cap P_1$) defined
by $\gamma_0$ and $\gamma_1$. Clearly, $\sigma \in T$. □
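The merging step at the end of the proof can be illustrated with a small finite sketch. Everything concrete here is an assumption made for illustration only: the pairing function is taken to be the Cantor pairing (the text does not fix one at this point), the two predicates `gamma0` and `gamma1` are toy stand-ins for the defining formulas, and ordinary integers replace the elements of the model.

```python
from math import isqrt


def unpair(code: int) -> tuple:
    """Invert the Cantor pairing <j, x> = (j + x)(j + x + 1)/2 + x (assumed convention)."""
    w = (isqrt(8 * code + 1) - 1) // 2
    x = code - w * (w + 1) // 2
    return w - x, x  # (j, x)


def merged_segment(gamma0, gamma1, ell0: int, b: int) -> list:
    """Length-b initial segment of the characteristic function of the merged partition.

    Blocks j < ell0 are read off gamma0; blocks j >= ell0 are read off gamma1,
    shifted down by ell0, exactly as in the definition of sigma in the proof.
    """
    sigma = []
    for code in range(b):
        j, x = unpair(code)
        bit = gamma0(j, x) if j < ell0 else gamma1(j - ell0, x)
        sigma.append(1 if bit else 0)
    return sigma


# Toy data (hypothetical): one block covering the even numbers, and two blocks
# covering the numbers congruent to 1 and to 3 modulo 4.
gamma0 = lambda j, x: j == 0 and x % 2 == 0
gamma1 = lambda j, x: (j == 0 and x % 4 == 1) or (j == 1 and x % 4 == 3)

sigma = merged_segment(gamma0, gamma1, ell0=1, b=10)
```

In the proof the analogous string exists in the model because the defining formulas are $\Delta^0_2$ and $\mathsf{B}\Sigma^0_2$ holds; the sketch sidesteps this by using decidable predicates.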


Now define the auxiliary notion of forcing whose conditions are triples $(\Gamma, E, I)$
such that $\Gamma \supseteq \widehat{\Gamma}$ and $(E, I)$ is a $\Gamma$-large$_1$ Mathias condition. Lemmas 8.7.9 and 8.7.10
then carry over, mutatis mutandis.


Proof (of Proposition 8.7.5). Assume $(\widehat{\Gamma}, \widehat{E}, \widehat{I})$ is a counterexample to (8.4). Again,
we assume $c$ has no solution in $\mathcal{M}$, so that a generic $G$ for our forcing notion is an
$\mathcal{M}$-infinite subset of $P_1$. $\mathcal{M}[G]$ satisfies $\mathsf{I}\Sigma^0_2$ as before. □


8.7.4 What else is known? 


The first order part of Ramsey’s theorem is now quite well understood. In fact, for the
version with arbitrarily many colors ($\mathsf{RT}^2$), it is known exactly. Cholak, Jockusch,
and Slaman [33] established an analogue of Theorem 8.7.1 one level up in the
arithmetical hierarchy.


Theorem 8.7.17 (Cholak, Jockusch, and Slaman [33]). $\mathsf{WKL}_0 + \mathsf{RT}^2 + \mathsf{I}\Sigma^0_3$ is $\Pi^1_1$
conservative over $\mathsf{RCA}_0 + \mathsf{I}\Sigma^0_3$.


Once again, the proof proceeds by separate model extension arguments for $\mathsf{COH}$
and $\mathsf{SRT}^2$. Belanger [13], building on earlier results of Hájek (Theorem 7.7.6),
then obtained a strengthening of the $\mathsf{COH}$ result, showing that $\mathsf{WKL}_0 + \mathsf{COH}$ is $\Pi^1_1$
conservative over $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_3$. Finally, Slaman and Yokoyama [292] established the
same conservation level also for $\mathsf{SRT}^2$. Thus, we obtain:


Theorem 8.7.18 (Slaman and Yokoyama [292]). $\mathsf{WKL}_0 + \mathsf{RT}^2$ is $\Pi^1_1$ conservative
over $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_3$.


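Combining this with the fact that $\mathsf{SRT}^2$ implies $\mathsf{B}\Sigma^0_3$ (Theorem 8.7.2), it follows that
the first order part of $\mathsf{RT}^2$ (and indeed, $\mathsf{SRT}^2$ as well as $\mathsf{WKL} + \mathsf{RT}^2$) is $\mathsf{B}\Sigma^0_3$.

When the number of colors is fixed (and at least 2), Theorem 8.7.1 gives $\mathsf{I}\Sigma^0_2$ as
an upper bound, while Hirst’s theorem (and the fact that $\mathsf{RT}^2_2 \to \mathsf{RT}^1$ over $\mathsf{RCA}_0$)
gives $\mathsf{B}\Sigma^0_2$ as a lower bound. The first order part in this case, while not exactly pinned
down, is known to lie closer to the latter.


Theorem 8.7.19 (Chong, Slaman, and Yang [40]). $\mathsf{RCA}_0 + \mathsf{RT}^2_2 \nvdash \mathsf{I}\Sigma^0_2$.


Theorem 8.7.20 (Patey and Yokoyama [246]). $\mathsf{WKL}_0 + \mathsf{RT}^2_2$ is $\Pi^0_3$ conservative
over $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2$.


Whether the first order part of $\mathsf{RT}^2_2$ is exactly $\mathsf{B}\Sigma^0_2$ remains open.


Question 8.7.21. Is $\mathsf{RCA}_0 + \mathsf{RT}^2_2$ $\Pi^1_1$ conservative over $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2$?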
But $\Pi^0_3$ formulas already include a large segment of natural arithmetical statements,
in particular, all consistency statements, and all $\forall\exists$ theorems of arithmetic. Indeed,
one consequence of Theorem 8.7.20 is that $\mathsf{RT}^2_2$ is “finitistically reducible”, in the
sense discussed in the introduction, a particularly striking fact given that Ramsey’s
theorem (in the form we are discussing here) seems so intrinsically to be about
infinite sets. (The interested reader may enjoy looking at the March 2016 edition of
Quanta Magazine, where a popular account of Theorem 8.7.20 and its implications
is presented.)

It is also still open whether, as for the versions with arbitrarily many colors, the
first order strength of $\mathsf{RT}^2_2$ is the same as that of $\mathsf{SRT}^2_2$. More generally, elucidating the
precise relationship between Ramsey’s theorem for pairs and the stable Ramsey’s 
theorem for pairs has been, and is still, one of the central themes in computable 
combinatorics and the reverse mathematics of combinatorial problems. We dedicate 
the next section to exploring this question in more detail. 

We conclude this section with one more result, concerning the first order strength
of $\mathsf{COH}$. In Theorem 7.7.1 we proved that $\mathsf{COH}$ is $\Pi^1_1$ conservative over $\mathsf{RCA}_0$, mostly
as a warm-up for Harrington’s theorem (Theorem 7.7.3). We cannot improve this to
$\Pi^1_2$, of course, since $\mathsf{COH}$ is itself a $\Pi^1_2$ statement. But as it turns out, we can improve
it to a restricted class of $\Pi^1_2$ statements.


Definition 8.7.22 (Hirschfeldt and Shore [152]). An $\mathcal{L}_2$ theory $T_1$ is restricted $\Pi^1_2$
conservative over an $\mathcal{L}_2$ theory $T_0$ if every sentence of the form

$(\forall X)[\varphi(X) \to (\exists Y)\psi(X, Y)]$,

where $\varphi$ is arithmetical and $\psi$ is $\Sigma^0_3$, that is provable in $T_1$ is provable in $T_0$.
We will prove the following result. 


Theorem 8.7.23 (Hirschfeldt and Shore [152]). $\mathsf{RCA}_0 + \mathsf{COH}$ is restricted $\Pi^1_2$
conservative over $\mathsf{RCA}_0$.


Note that $\mathsf{COH}$ itself has the form $(\forall X)[\varphi(X) \to (\exists Y)\psi(X, Y)]$ where $\varphi$ is
arithmetical and $\psi$ is $\Pi^0_3$. To prove the theorem, we begin with the following lemma.


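Lemma 8.7.24. Let $\mathcal{M}$ be a countable model of $\mathsf{RCA}_0$ and suppose $\psi$ is a $\Sigma^0_3$ formula
such that $\mathcal{M} \models (\forall Y)[\neg\psi(A, Y)]$ for some $A \in \mathcal{S}^{\mathcal{M}}$. Then if $G$ is sufficiently generic
for Mathias forcing in $\mathcal{M}$, $\mathcal{M}[G] \models (\forall Y)[\neg\psi(A, Y)]$.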


Proof. We prove the contrapositive. Fix $G$, and suppose there is a $B \in \mathcal{M}[G]$ such
that $\mathcal{M}[G] \models \psi(A, B)$. As $B$ is $\Delta^0_1$-definable from $G$ and parameters in $\mathcal{M}$, we can
fix an $e \in \omega$ such that $B = \Phi_e(G \oplus C)$ for some $C \in \mathcal{S}^{\mathcal{M}}$. Consider the formula


$\Phi_e(Z \oplus W)$ is total, $\{0, 1\}$-valued, and $\psi(X, \Phi_e(Z \oplus W))$.     (8.5)


Over RCAg, this is equivalent to a 181 formula. Let 6(X, W, Z,x, y) bea pa formula 
so that RCAg proves that (8.5) is equivalent to (Vx)(Ay)@(X, W, Z, x, y). Since G is 
generic, there is a condition $(E, I)$ such that $\mathcal{M}[G] \models E \subseteq G \subseteq E \cup I$ and $(E, I)$
forces $(\forall x)(\exists y)\theta(A, C, G, x, y)$ in $\mathcal{M}$. So for every $b \in M$, every extension $(E^*, I^*)$
of $(E, I)$ has a further extension forcing $(\exists y)\theta(A, C, G, b, y)$. By Exercise 7.8.10,
the latter can always be achieved by an $\mathcal{M}$-finite extension of $(E^*, I^*)$. Hence, by
recursion in $\mathcal{M}$, we can define a sequence $\langle E_b : b \in M \rangle$ of $\mathcal{M}$-finite sets with
$E = E_0$ and satisfying the following for every $b \in M$.


• $E_b \subseteq E_{b+1} \subseteq E_b \cup I$.
• $E_{b+1} \setminus E_b$ is $\mathcal{M}$-finite.
• $E_{b+1}$ forces $(\exists y)\theta(A, C, G, b, y)$.


As $\mathcal{M} \models \mathsf{RCA}_0$, the sequence $\langle E_b : b \in M \rangle$ is in $\mathcal{S}^{\mathcal{M}}$, and consequently so is
$H = \bigcup_{b \in M} E_b$. By Exercise 7.8.9, $\mathcal{M} \models (\forall x)(\exists y)\theta(A, C, H, x, y)$. In other words,
$\mathcal{M}$ satisfies that $\Phi_e(H \oplus C)$ defines a set and that $\psi(A, \Phi_e(H \oplus C))$ holds. But
$\Phi_e(H \oplus C) \in \mathcal{S}^{\mathcal{M}}$, so $\mathcal{M} \models (\exists Y)\psi(A, Y)$, as was to be shown. □


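Proof (of Theorem 8.7.23). Fix an arithmetical formula $\varphi$ and a $\Sigma^0_3$ formula $\psi$, and
suppose $\mathcal{M}$ is a countable model of $\mathsf{RCA}_0$ satisfying $\neg(\forall X)[\varphi(X) \to (\exists Y)\psi(X, Y)]$.
We can fix $A \in \mathcal{S}^{\mathcal{M}}$ such that $\mathcal{M} \models \varphi(A) \wedge (\forall Y)[\neg\psi(A, Y)]$.

Let $\mathcal{N}_0 = \mathcal{M}$, and for $i \in \omega$, let $G_i$ be sufficiently generic for Mathias forcing in
$\mathcal{N}_i$ and let $\mathcal{N}_{i+1} = \mathcal{N}_i[G_i]$. Note that $\mathcal{N}_i$ is an $\omega$-submodel of $\mathcal{N}_{i+1}$ for all $i$. Hence,
all the $\mathcal{N}_i$ have the same first order part as $\mathcal{M}$, as does $\mathcal{N} = \bigcup_{i \in \omega} \mathcal{N}_i$. By the proof
of Theorem 7.7.1, $\mathcal{N}_i \models \mathsf{RCA}_0$ for all $i$, and moreover, if $R$ is an instance of $\mathsf{COH}$ in
$\mathcal{N}_i$ then $G_i$ is $R$-cohesive. It follows that $\mathcal{N} \models \mathsf{RCA}_0 + \mathsf{COH}$.

We claim that $\mathcal{N} \models \neg(\forall X)[\varphi(X) \to (\exists Y)\psi(X, Y)]$, which proves what we
want. Since $\mathcal{M}$ is an $\omega$-submodel of $\mathcal{N}$, $A \in \mathcal{S}^{\mathcal{N}}$, and since $\varphi$ is arithmetical
we have $\mathcal{N} \models \varphi(A)$. By the lemma, it follows by (external) induction on $i$ that
$\mathcal{N}_i \models (\forall Y)[\neg\psi(A, Y)]$. Since every set in $\mathcal{S}^{\mathcal{N}}$ belongs to $\mathcal{S}^{\mathcal{N}_i}$ for some $i$, we must
have $\mathcal{N} \models (\forall Y)[\neg\psi(A, Y)]$ as well. The proof is complete. □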


8.8 The $\mathsf{SRT}^2_2$ vs. $\mathsf{COH}$ problem


The Cholak–Jockusch–Slaman decomposition raises an immediate question, which
is whether it is actually proper. That is, we should like to know if either of $\mathsf{SRT}^2_2$
or $\mathsf{COH}$ already implies $\mathsf{RT}^2_2$ over $\mathsf{RCA}_0$ by itself. Of course, this is equivalent to
asking if either of $\mathsf{SRT}^2_2$ or $\mathsf{COH}$ implies the other. This has come to be called by
some the $\mathsf{SRT}^2_2$ vs. $\mathsf{COH}$ problem.

We already know part of the answer. By Theorem 7.7.1, $\mathsf{COH}$ is $\Pi^1_1$-conservative
over $\mathsf{RCA}_0$, whereas $\mathsf{RT}^2_2$ is not since it implies $\mathsf{B}\Sigma^0_2$.


Corollary 8.8.1. $\mathsf{RCA}_0 \nvdash \mathsf{COH} \to \mathsf{RT}^2_2$ (or $\mathsf{SRT}^2_2$).


Hirschfeldt, Jockusch, Kjos-Hanssen, Lempp, and Slaman [149] showed it is also 
possible to prove this using a computability theoretic argument instead of a proof 
theoretic one. This gives a stronger separation, witnessed by an w-model. 




Theorem 8.8.2 (Hirschfeldt, et al. [149]). $\mathsf{RT}^2_2 \nleq_\omega \mathsf{COH}$.


Proof. The proof makes use of the principle $\mathsf{DNR}$, introduced in Definition 4.3.9.
The first step is to show that $\mathsf{DNR} \le_\omega \mathsf{RT}^2_2$, and this is left to the exercises
(Exercise 8.10.4). We show that $\mathsf{DNR} \nleq_\omega \mathsf{COH}$, which gives the theorem. Let $\mathcal{C}$ be
the class of all sets $Y$ that compute no DNC function. Then $\mathcal{C}$ is closed downward
under $\le_T$, and by definition, $\mathsf{DNR}$ does not admit preservation of $\mathcal{C}$ (in the sense of
Definition 3.6.11). By Theorem 4.6.13, it therefore suffices to show that $\mathsf{COH}$ admits
preservation of $\mathcal{C}$. That is, we must show that for every set $A$ that computes no DNC
function, every $A$-computable instance of $\mathsf{COH}$ has a solution $Y$ such that $A \oplus Y$
still computes no DNC function. We prove this in the case $A = \emptyset$. The full result
follows easily by relativization. Let $R = \langle R_i : i \in \omega \rangle$ be a computable instance of
$\mathsf{COH}$. We force with Mathias conditions with computable reservoirs. As discussed in
Example 7.3.9, any sufficiently generic set $G$ for this forcing will be $R$-cohesive. We
claim no such $G$ computes a DNC function. To this end, it clearly suffices to show
that for each Turing functional $\Gamma$, the set of conditions forcing, for some $e \in \omega$, either
that $\Gamma^G(e){\downarrow} = \Phi_e(e){\downarrow}$, or that $\Gamma^G(e){\uparrow}$, is dense. To see this, fix $\Gamma$ and any condition
$(E, I)$. If there is an $e$ and a finite set $F \subseteq I$ such that $\Gamma^{E \cup F}(e){\downarrow} = \Phi_e(e){\downarrow}$, then
$(E \cup F, \{x \in I : x > F\})$ is an extension of $(E, I)$ forcing $\Gamma^G(e){\downarrow} = \Phi_e(e){\downarrow}$. If,
instead, there is an $e$ such that $\Gamma^{E \cup F}(e){\uparrow}$ for all finite $F \subseteq I$, then $(E, I)$ itself forces
$\Gamma^G(e){\uparrow}$. So assume neither is true. We derive a contradiction. By assumption, for
each $e$ there is a finite set $F \subseteq I$ such that $\Gamma^{E \cup F}(e){\downarrow}$, and for this $e$, if $\Phi_e(e){\downarrow}$ then
it must be the case that $\Gamma^{E \cup F}(e) \ne \Phi_e(e)$. But then since $I$ is computable we can
compute a DNC function $f$ as follows: given $e$, search for the least $F \subseteq I$ such that
$\Gamma^{E \cup F}(e){\downarrow}$, and let $f(e) = \Gamma^{E \cup F}(e)$. Since no DNC function can be computable,
the proof is complete. □


What about $\mathsf{SRT}^2_2$: does it imply $\mathsf{RT}^2_2$, or equivalently, $\mathsf{COH}$? The question was
first asked by Cholak, Jockusch, and Slaman [33]. Over the next decade and a half,
it saw a spectacular amount of interest. Particularly after the proof of Liu’s theorem
(Theorem 8.6.1), it became the question in the reverse mathematics of combinatorics.
To be sure, there are some obvious and immediate differences between $\mathsf{SRT}^2_2$ and
$\mathsf{RT}^2_2$. For instance, by Corollary 8.4.7 and Theorem 8.2.2, $\mathsf{SRT}^2_2$ admits $\Delta^0_2$ solutions
whereas $\mathsf{RT}^2_2$ omits $\Delta^0_2$ solutions. This yields:


Corollary 8.8.3. $\mathsf{RT}^2_2 \nleq_c \mathsf{SRT}^2_2$.


But by Theorem 8.4.13, $\mathsf{COH}$ also admits $\Delta^0_2$ solutions, so this does not settle whether
or not $\mathsf{COH}$ is computably reducible to $\mathsf{SRT}^2_2$. And of course, $\le_c$ measures only
one-time computable transformations, not repeated applications as might be found in an
$\omega$-model reduction or a proof in $\mathsf{RCA}_0$.

Several ideas were proffered for separating $\mathsf{SRT}^2_2$ from $\mathsf{RT}^2_2$, focused on isolating
a computability theoretic property, enjoyed by $\mathsf{SRT}^2_2$ and failed by $\mathsf{RT}^2_2$, that could be
iterated to produce an $\omega$-model satisfying the former and not the latter. Cholak,
Jockusch, and Slaman [33] asked whether $\mathsf{SRT}^2_2$ might admit low solutions, which when
combined with Theorem 8.2.2, would yield the desired $\omega$-model. Unfortunately, this
possibility was quickly quashed.
Theorem 8.8.4 (Downey, Hirschfeldt, Lempp, and Solomon [75]). $\mathsf{SRT}^2_2$ omits
low solutions.


We will discuss the proof of this result below. Attempts were then made to show that
for some $n > 1$, $\mathsf{SRT}^2_2$ admits low$_n$ $\Delta^0_2$ solutions, which can also be used to produce
an $\omega$-model separation. This remains open, but ultimately was not used to answer
the main question. Hirschfeldt, Jockusch, Kjos-Hanssen, Lempp, and Slaman [149]
obtained a partial result, showing that $\mathsf{SRT}^2_2$ admits incomplete $\Delta^0_2$ solutions (i.e.,
every $A$-computable instance of $\mathsf{SRT}^2_2$ has a solution $H$ with $H \oplus A <_T A'$). But
this cannot be iterated, and so does not yield an $\omega$-model witnessing a separation.
More complicated degree theoretic properties were later considered by Kach and
Solomon [174], who also proposed a framework for organizing and studying such
properties, and the resulting $\omega$-models, more generally. Alas, the specific properties
they looked at did not yield a separation either.

In the end, the solution was finally obtained, in some sense, by going back to the
very beginning. Given a model $\mathcal{M}$ and set $X \in \mathcal{S}^{\mathcal{M}}$, say $X$ is low in $\mathcal{M}$ if every $\Delta^0_2$
formula having at most $X$ as a set parameter is equivalent in $\mathcal{M}$ to a $\Delta^0_2$ formula with
no set parameters. (Here, “$\Delta^0_2$ formula” is being used as in Section 8.7.3.)


Theorem 8.8.5 (Chong, Slaman, and Yang [39]). There exists a countable model
$\mathcal{M}$ satisfying $\mathsf{RCA}_0 + \mathsf{SRT}^2_2$ such that every $X \in \mathcal{S}^{\mathcal{M}}$ is low in $\mathcal{M}$.


By contrast, it is not difficult to formalize the $n = 2$ case in the proof of Theorem 8.2.2
in $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2$. That is, $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2$ proves that $\mathsf{RT}^2_2$ omits $\Delta^0_2$ (and hence also low)
solutions. It follows that the model $\mathcal{M}$ above cannot also satisfy $\mathsf{RT}^2_2$. And so we
have the answer:


Corollary 8.8.6 (Chong, Slaman, and Yang [39]). $\mathsf{RCA}_0 \nvdash \mathsf{SRT}^2_2 \to \mathsf{RT}^2_2$.


The proof is remarkable in a couple of ways. Most visibly, it is not an w-model 
separation, unlike the earlier attempts. Indeed, M above is designed specifically 
so as to allow something that would be impossible if it were standard, namely, for 
Theorem 8.8.4 to fail and $\mathsf{SRT}^2_2$ to admit low solutions. Thus, this is not a purely
proof theoretic result either (like the conservation theorems we have seen), as there 
is an important interplay between the first order and second-order parts. 

Effectively, Theorem 8.8.5 shows that the original attempt at separating $\mathsf{SRT}^2_2$ and
$\mathsf{RT}^2_2$, using low sets, does work, just over a nonstandard universe. This means that
the proof of Theorem 8.8.4 requires more induction to formalize than is available
in the model $\mathcal{M}$. The proof there is an infinite injury priority argument, and as
pointed out e.g. in [218], it can be carried out in $\mathsf{RCA}_0 + \mathsf{I}\Sigma^0_2$. That fits, as Chong,
Slaman, and Yang [39] show that their model $\mathcal{M}$ satisfies $\neg\mathsf{I}\Sigma^0_2$. On the other hand,
$\mathcal{M}$ certainly satisfies $\mathsf{B}\Sigma^0_2$, since this follows from $\mathsf{SRT}^2_2$. So, we get an interesting
reverse recursion theory result: any proof of Theorem 8.8.4 necessarily makes use of
at least some induction above $\mathsf{B}\Sigma^0_2$. In particular, per our discussion in Section 6.4,
the theorem cannot be proved by any (typical) finite injury argument. 
The question thus naturally turns to $\omega$-models. This has seen another long string
of attempts at a solution, including revivals of some of the earlier ideas that preceded
the Chong–Slaman–Yang result. These were joined by new efforts that focused
on finer reducibilities as a way to “work up” to an $\omega$-model separation. In light
of Corollary 8.8.3, that $\mathsf{RT}^2_2 \nleq_c \mathsf{SRT}^2_2$, these looked to compare $\mathsf{COH}$ and $\mathsf{SRT}^2_2$
directly. Dzhafarov [79, 80] showed that $\mathsf{COH} \nleq_W \mathsf{SRT}^2_2$ and $\mathsf{COH} \nleq_{sc} \mathsf{SRT}^2_2$. Later,
Dzhafarov, Patey, Solomon, and Westrick [90] extended the latter result by showing
that $\mathsf{COH} \nleq_{sc} \mathsf{SRT}^2$. (As we will see in Section 9.1, $\mathsf{SRT}^2$ is strictly stronger than
$\mathsf{SRT}^2_2$ under both $\le_W$ and $\le_{sc}$. This is analogous to Propositions 4.3.7 and 4.4.6.)
We will say more about these partial results at the end of this section. Later on, we
will encounter some of the techniques developed for these results, which have since
found applications also to other problems. (One example is the tree labeling method
discussed in Sections 9.1 and 9.2.)

The w-model separation was found at long last in 2019, once again by rather 
different methods than those that had been tried up to that point. 


Theorem 8.8.7 (Monin and Patey [218]). For all $A, C \in 2^\omega$ with $A' \not\gg C$, every
$A$-computable instance of $\mathsf{SRT}^2_2$ has a solution $H$ so that $(A \oplus H)' \not\gg C$.


As noted following Theorem 8.4.13 above, every $\omega$-model of $\mathsf{COH}$ must contain a
set $X$ with $X' \gg \emptyset'$. But by iterating and dovetailing the above theorem (taking
$C = \emptyset'$), it is possible to obtain an $\omega$-model of $\mathsf{SRT}^2_2$ that contains no such $X$. Hence,
we obtain:


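Corollary 8.8.8 (Monin and Patey [218]). $\mathsf{COH} \nleq_\omega \mathsf{SRT}^2$ (and so $\mathsf{RT}^2_2 \nleq_\omega \mathsf{SRT}^2_2$).
Hence also $\mathsf{RCA}_0 \nvdash \mathsf{SRT}^2 \to \mathsf{COH}$ (and so $\mathsf{RCA}_0 \nvdash \mathsf{SRT}^2 \to \mathsf{RT}^2_2$).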


Recall that $\mathsf{SRT}^2_2$ does not imply $\mathsf{SRT}^2$ over $\mathsf{RCA}_0$ by Corollary 8.7.3, so in fact the
second part above is a stronger conclusion than Corollary 8.8.6.

The statement of Theorem 8.8.7 is essentially that of Liu’s theorem, but one 
jump up. (Accordingly, some authors call the property expressed there jump PA 
avoidance.) Liu’s proof may seem tantalizingly close to being directly modifiable to 
give this result, but this is somewhat illusory. Indeed, there are significant technical 
obstacles, most notably because the relevant formulas to force are now one quantifier 
more complex. The innovation of the Monin—Patey proof is the development of a 
suitable notion of largeness that keeps the complexity of forcing these formulas down 
(which, as per Remark 7.7.2, is crucial). This is the key breakthrough that was
missing from earlier attempts and sets their argument apart. 

Let us conclude this section by looking at what else can be asked about $\mathsf{SRT}^2_2$
and COH. The following definition has come up independently in several works 
(e.g., [79, 148, 243]) but was first isolated and named by Monin and Patey (in work 
unrelated to their proof of Theorem 8.8.7). 


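Definition 8.8.9 (Monin and Patey [217]). Let $\mathsf{P}$ and $\mathsf{Q}$ be problems.


1. $\mathsf{P}$ is omnisciently computably reducible to $\mathsf{Q}$, written $\mathsf{P} \le_{oc} \mathsf{Q}$, if for every
$\mathsf{P}$-instance $X$ there is a $\mathsf{Q}$-instance $\widehat{X}$ such that if $\widehat{Y}$ is any $\mathsf{Q}$-solution to $\widehat{X}$ then
$X \oplus \widehat{Y}$ computes a $\mathsf{P}$-solution to $X$.
2. $\mathsf{P}$ is strongly omnisciently computably reducible to $\mathsf{Q}$, written $\mathsf{P} \le_{soc} \mathsf{Q}$, if for
every $\mathsf{P}$-instance $X$ there is a $\mathsf{Q}$-instance $\widehat{X}$ such that if $\widehat{Y}$ is any $\mathsf{Q}$-solution to $\widehat{X}$
then $\widehat{Y}$ computes a $\mathsf{P}$-solution to $X$.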


The emphasis here is on the fact that the $\mathsf{Q}$-instance $\widehat{X}$ need not be computable from
the $\mathsf{P}$-instance $X$, or be effective in any other way. Clearly, $\le_{sc}$ implies $\le_{soc}$ which
implies $\le_{oc}$, and $\le_c$ also implies $\le_{oc}$. These reducibilities thus eliminate some of the
computational dependence between two problems, and so in a sense, allow for their
comparison on more purely combinatorial terms: is one problem “combinatorially”
reducible to another? A good example that illustrates this is the following, which is
immediate from Propositions 8.4.4 and 8.4.5.


Proposition 8.8.10. For all $k \ge 1$, $\mathsf{D}^2_k \equiv_{soc} \mathsf{RT}^1_k$.


Indeed, to someone uninterested in computability theoretic considerations, finding 
a limit homogeneous set for a stable coloring of pairs is no different than finding a 
homogeneous set for a coloring of singletons. 
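To make this remark concrete, here is a small sketch of the passage from a stable coloring of pairs to a coloring of singletons. The coloring `toy_coloring` and its modulus `toy_modulus` are hypothetical and chosen only so that the limits are known in advance; for a genuine stable coloring the limit coloring is in general not computable from the coloring, which is exactly the information an omniscient reduction is allowed to ignore.

```python
def toy_coloring(x: int, y: int) -> int:
    """A toy stable coloring of pairs x < y (an assumption made for illustration).

    For fixed x the color is 0 while y < 2x + 1 and equals x % 2 from then on,
    so lim_y c(x, y) = x % 2.
    """
    return x % 2 if y >= 2 * x + 1 else 0


def toy_modulus(x: int) -> int:
    """A point past which y -> toy_coloring(x, y) is constant (known only for this toy)."""
    return 2 * x + 1


def limit_color(c, modulus, x: int) -> int:
    """Return lim_y c(x, y), given a modulus of stabilization for c."""
    y = max(x + 1, modulus(x))
    return c(x, y)


def induced_singleton_coloring(c, modulus, bound: int) -> list:
    """The coloring d(x) = lim_y c(x, y) of singletons, restricted to x < bound."""
    return [limit_color(c, modulus, x) for x in range(bound)]


d = induced_singleton_coloring(toy_coloring, toy_modulus, 10)
odds = [x for x in range(10) if d[x] == 1]
```

Any set homogeneous for `d` is limit homogeneous for `toy_coloring` with the same color, and conversely. The odd numbers show that limit homogeneity is weaker than homogeneity: they all have limit color 1, yet `toy_coloring(1, 3)` is 1 while `toy_coloring(3, 5)` is 0.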

These reducibilities are thus quite interesting in the specific case of the $\mathsf{SRT}^2_2$
vs. $\mathsf{COH}$ problem. Several things here are known. For one, we have the following
considerable strengthening of the aforementioned result that $\mathsf{COH} \nleq_{sc} \mathsf{SRT}^2$.


Theorem 8.8.11 (Dzhafarov, Patey, Solomon, and Westrick [90]). $\mathsf{COH} \nleq_{soc} \mathsf{SRT}^2$.


(See also Theorems 9.1.1 and 9.1.10 below, which are related.) This is quite different 
from the original question, whether $\mathsf{COH} \le_\omega \mathsf{SRT}^2_2$. But it gives new insight into
the relationship between the two principles, especially when combined with the 
following. 


Theorem 8.8.12 (Cholak, Dzhafarov, Hirschfeldt, and Patey [28]). $\mathsf{COH} \le_{oc} \mathsf{SRT}^2_2$.


Proof. Let $R = \langle R_0, R_1, \ldots \rangle$ be a given instance of $\mathsf{COH}$. We use the notation $R_\sigma$, for
$\sigma \in 2^{<\omega}$, defined in the proof of Theorem 8.4.13. Let $c : [\omega]^2 \to 2$ be the following
coloring: for $x < y$, set

$c(x, y) = \begin{cases} 0 & \text{if } (\exists \sigma \in 2^{<\omega})(\exists z \ge y)[|\sigma| = x \wedge z \in R_\sigma \text{ and } R_\sigma \text{ is finite}], \\ 1 & \text{otherwise.} \end{cases}$

We claim that $\lim_y c(x, y) = 1$ for all $x$, and so in particular that $c$ is an instance of
$\mathsf{SRT}^2_2$. Given $x$, fix the least $m_x \ge x$ such that for all $\sigma$ of length $x$, if $R_\sigma$ is finite
then $R_\sigma < m_x$. Then $c(x, y) = 1$ for all $y \ge m_x$.

Let $H = \{h_0 < h_1 < \cdots\}$ be any infinite homogeneous set for $c$, necessarily
with color 1. The minimality of $m_x$ above implies that, for each $x$, $c(x, y) = 0$ for all
$x < y < m_x$ (so the value of $c(x, y)$ changes at most once). From here it is readily
seen that $h_{x+1} \ge m_{h_x} \ge m_x$ for all $x$.
We now define a sequence of binary strings $\sigma_0 \preceq \sigma_1 \preceq \cdots$ computably from
$R \oplus H$. Let $\sigma_0 = \langle\rangle$, and assume inductively that $\sigma_{x-1}$ has been defined for some
$x > 0$ and that $R_{\sigma_{x-1}}$ is infinite. At least one of $R_{\sigma_{x-1}} \cap R_x$ and $R_{\sigma_{x-1}} \cap \overline{R_x}$ is infinite,
so we can search for and find a $z \ge h_{x+1}$ in one of these two intersections. Let $\sigma_x$
be $\sigma_{x-1}0$ or $\sigma_{x-1}1$, depending on which of the two we find the least such $z$ in. Now,
$z \ge m_x$, so $R_{\sigma_x}$ is infinite.

Now let $y_0 = \min R_{\sigma_0}$, and having defined $y_{x-1}$ for some $x > 0$ let $y_x$ be the least
element of $R_{\sigma_x}$ larger than $y_{x-1}$. Then $S = \{y_x : x \in \omega\}$ is an infinite $R$-cohesive
set computable from the sequence $\sigma_0 \preceq \sigma_1 \preceq \cdots$ and hence from $R \oplus H$. □
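The construction in this proof can be traced on a toy instance. The sketch below makes several assumptions that are not taken from the text: the family `in_R` is a hypothetical three-part example, a string of length $x$ is taken to decide membership in $R_0, \ldots, R_{x-1}$ with bit 1 meaning "inside" (the indexing and bit convention of Theorem 8.4.13 may differ), the set `H` is a hand-picked homogeneous set for the corresponding coloring, and all searches are cut off at a finite limit so that the code terminates.

```python
def in_R(i: int, n: int) -> bool:
    """Toy family (hypothetical): R_0 = evens, R_1 = {0,...,5}, R_i = all numbers for i >= 2."""
    if i == 0:
        return n % 2 == 0
    if i == 1:
        return n < 6
    return True


def in_R_sigma(sigma: tuple, n: int) -> bool:
    """Membership in R_sigma, under the assumed convention that bit 1 means 'in R_i'."""
    return all(in_R(i, n) == bool(bit) for i, bit in enumerate(sigma))


def least_member(sigma: tuple, start: int, limit: int) -> int:
    """Least z >= start in R_sigma, searching only below the cutoff `limit`."""
    for z in range(start, limit):
        if in_R_sigma(sigma, z):
            return z
    raise ValueError("search limit too small for this toy run")


def build_strings(H: list, steps: int, limit: int = 1000) -> list:
    """Build sigma_0, ..., sigma_steps, searching above h_{x+1} at stage x."""
    sigmas = [()]
    for x in range(1, steps + 1):
        prev = sigmas[-1]
        z = least_member(prev, H[x + 1], limit)
        bit = 1 if in_R(x - 1, z) else 0  # which side of the next set z falls on
        sigmas.append(prev + (bit,))
    return sigmas


def cohesive_prefix(sigmas: list, limit: int = 1000) -> list:
    """The elements y_0 < y_1 < ... with y_x taken from R_{sigma_x}."""
    ys = []
    for sigma in sigmas:
        start = ys[-1] + 1 if ys else 0
        ys.append(least_member(sigma, start, limit))
    return ys


# For this family every finite R_sigma lies below 6, so {6, 7, 8, ...} is
# homogeneous with color 1 for the coloring c of the proof.
H = list(range(6, 20))
sigmas = build_strings(H, steps=4)
S = cohesive_prefix(sigmas)
```

Searching above $h_{x+1}$ is what keeps the construction away from the finite set $R_1 \cap R_0$: every string chosen has second bit 0. In the general case the same role is played by the bound $h_{x+1} \ge m_x$, which the toy example makes visible only because its finite pieces are known explicitly.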


The above proof exploits the sparsity of homogeneous sets, much like the proof
of Proposition 8.5.5. As there, the same argument would not work for $\mathsf{D}^2$ (or equivalently,
$\mathsf{RT}^1$, since we are comparing under omniscient reductions), because $\omega$ is limit
homogeneous for $c$. And so we are led naturally to the following key question.


Question 8.8.13. Is it the case that $\mathsf{COH} \le_{oc} \mathsf{D}^2$ (or equivalently, $\mathsf{RT}^1$)?


This is the distillation of the $\mathsf{SRT}^2_2$ vs. $\mathsf{COH}$ problem to its most combinatorial form,
and will likely require rather different methods to tackle. 


8.9 Summary: Ramsey’s theorem and the “big five”


Having studied the computability theoretic properties of Ramsey’s theorem in some 
detail, we are now prepared to locate the theorem and its restrictions within the 
hierarchy of subsystems of second order arithmetic. 


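Theorem 8.9.1.
(Ramsey’s theorem for singletons):


1. For each $k \ge 1$, $\mathsf{RCA}_0 \vdash \mathsf{RT}^1_k$.
2. $\mathsf{RCA}_0 \vdash \mathsf{RT}^1 \leftrightarrow \mathsf{B}\Sigma^0_2$.


(Ramsey’s theorem for pairs):


3. $\mathsf{RCA}_0 \vdash \mathsf{RT}^2_1$.
4. $\mathsf{ACA}_0 \vdash \mathsf{RT}^2$.
5. Over $\mathsf{RCA}_0$, $\mathsf{RT}^2$ and $\mathsf{WKL}$ are incomparable.
6. For all $k \ge 1$, $\mathsf{RCA}_0 \vdash \mathsf{RT}^2 \to \mathsf{RT}^2_k$.
7. $\mathsf{RCA}_0 \nvdash \mathsf{RT}^2_2 \to \mathsf{SRT}^2$.
8. For all $k \ge 2$, $\mathsf{RCA}_0 \vdash \mathsf{RT}^2_k \leftrightarrow \mathsf{SRT}^2_k + \mathsf{COH}$.
9. $\mathsf{RCA}_0 \vdash \mathsf{RT}^2 \leftrightarrow \mathsf{SRT}^2 + \mathsf{COH}$.
10. $\mathsf{RCA}_0 \nvdash \mathsf{COH} \to \mathsf{SRT}^2_2$ and $\mathsf{RCA}_0 \nvdash \mathsf{SRT}^2 \to \mathsf{COH}$.


(Ramsey’s theorem for exponent $n \ge 3$):


11. $\mathsf{RCA}_0 \vdash \mathsf{RT}^n_1$.
12. For all $k \ge 2$, $\mathsf{RCA}_0 \vdash \mathsf{RT}^n \leftrightarrow \mathsf{RT}^n_k \leftrightarrow \mathsf{ACA}_0$.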


Proof. Part (1) is trivial. Part (2) is Hirst’s theorem (Theorem 6.5.1). Part (3) is
trivial. Part (4) is proved in Proposition 8.1.5. Part (5) follows by Liu’s theorem
(Theorem 8.6.1) and the preceding discussion. Part (6) is clear. Part (7) is Corollary 8.7.3.
Parts (8) and (9) are in Theorem 8.5.1. The first half of part (10) is Corollary 8.8.1,
and the second is the Chong–Slaman–Yang theorem and Monin–Patey theorem
(Theorems 8.8.5 and 8.8.7). Finally, (11) is trivial, and (12) is Corollary 8.2.6. □


The diagram in Figure 8.2 gives us our first snapshot of the rich and complicated 
tapestry of relationships between combinatorial principles. This has come to be 
called the reverse mathematics zoo, and will be the subject of Section 9.12 below. 
Exercise 8.10.1 (Jockusch [168]). Fix $A \gg \emptyset'$ and $n, k \ge 1$. Show that every
computable $c : [\omega]^n \to k$ has an infinite $A$-computable pre-homogeneous set.


Exercise 8.10.2. Show that if $f : \omega \to \omega$ is total and dominates every partial
computable function then $\emptyset' \le_T f$.


Exercise 8.10.3. Show that RCAy + SRT; > RT. 


Exercise 8.10.4 (Hirschfeldt, Jockusch, Kjos-Hanssen, Lempp, and Slaman
[149]). Show that DNR < D2, (Hint: First, using an argument similar to the 
$n = 2$ case of Theorem 8.2.2, show that there is a set $P \le_T \emptyset'$ such that for all $e$, if
$W_e$ is a subset of $P$ or $\overline{P}$ then $|W_e| < 3e + 2$. Next, suppose $L$ is an infinite subset of
$P$ or $\overline{P}$, and let $g$ be the $L$-computable function such that for all $e$, $W_{g(e)}$ consists of
the least $3e + 2$ many elements of $L$. Show that $W_{g(e)} \ne W_e$ for all $e$. Finally, apply
Exercise 2.9.16.) 


Exercise 8.10.5. Show that RCAo + COH proves the following: if S is an infinite set 
and R is a family of sets, then there exists an infinite R-cohesive set X C S. 


Exercise 8.10.6. Use Theorem 8.4.13 to show that $\mathsf{COH}$ omits solutions of
hyperimmune free degree.


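Exercise 8.10.7 (Hirschfeldt and Shore [152]). Let $\mathsf{CRT}^2_2$ be the statement that
for every coloring $c : [\omega]^2 \to 2$ there is an infinite set $S$ such that $c \upharpoonright [S]^2$ is stable.
Show that $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2 \vdash \mathsf{CRT}^2_2 \leftrightarrow \mathsf{COH}$ (and so $\mathsf{COH} \equiv_\omega \mathsf{CRT}^2_2$).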


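[Figure 8.2: diagram not reproduced. Node labels: $\mathsf{ACA}_0' \leftrightarrow \mathsf{RT}$; $\mathsf{ACA}_0 \leftrightarrow \mathsf{RT}^n \leftrightarrow \mathsf{RT}^n_k$; $\mathsf{RT}^2 \leftrightarrow \mathsf{SRT}^2 + \mathsf{COH}$; $\mathsf{WKL}_0$; $\mathsf{D}^2 \leftrightarrow \mathsf{SRT}^2$; $\mathsf{RT}^2_k \leftrightarrow \mathsf{SRT}^2_k + \mathsf{COH}$; $\mathsf{D}^2_k \leftrightarrow \mathsf{SRT}^2_k$; $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2 \leftrightarrow \mathsf{RT}^1$; $\mathsf{COH}$; $\mathsf{RCA}_0$.]

Figure 8.2. The location of versions of Ramsey’s theorem below $\mathsf{ACA}_0'$. Here, $n \ge 3$ and $k \ge 2$
are arbitrary. Arrows denote implications over $\mathsf{RCA}_0$; double arrows are implications that cannot
be reversed. No additional arrows can be added.


Exercise 8.10.8. The Paris–Harrington theorem ($\mathsf{PH}$) is the following result strengthening
the finitary Ramsey’s theorem ($\mathsf{FRT}$) from Definition 3.3.6: for all $n, k, m \ge 1$
with $m \ge n$, there is an $N \in \omega$ such that for every finite set $X$ with $|X| \ge N$, every
$c : [X]^n \to k$ has a homogeneous set $H \subseteq X$ with $|H| \ge \max\{m, \min H\}$. Show that
$\mathsf{RCA}_0 \vdash \mathsf{RT} \to \mathsf{PH}$. It is a famous result of Paris and Harrington [238] that $\mathsf{PH}$ is not
provable in $\mathsf{PA}$. (For a proof of this fact, see Kaye [176, Section 14.3].) Conclude by
Corollary 5.10.7 that $\mathsf{RT}$ is not provable in $\mathsf{ACA}_0$.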


Exercise 8.10.9. Complete the proof of Lemma 8.7.7 by proving parts (1) and (2). 


Chapter 9


Other combinatorial principles


Seetapun’s theorem (Theorem 8.3.1) and the follow-up work of Cholak, Jockusch, 
and Slaman [33] led to the realization that not only are there natural principles 
defying the “big five” phenomenon in reverse mathematics, but that there may be 
many of them. The interest in finding more examples, and understanding why their 
strength is so different from that of most other theorems studied before, has grown 
into a massive research program. Many dozens of other such principles have now 
been identified. Curiously enough, most of them (though not all) have turned out to 
come from combinatorics and, indeed, are related to Ramsey’s theorem for pairs in 
one way or another. 

The goal of this chapter is to give a partial survey of the results of this investigation,
highlighting recent developments and connections to currently open problems.
Another overview is given by Hirschfeldt [147]. The overlap between that treatment 
and ours is minimal, and we feel the two accounts are quite complementary. 


9.1 Finer results about RT 


We have seen that Ramsey’s theorem exhibits different complexity bounds based on 
the exponent, but not based on the number of colors. However, in Chapter 4, we saw 
that versions of Ramsey’s theorem for different numbers of colors with the same 
exponent could be separated using finer reducibilities. In this section, we continue
this investigation and obtain very precise results about the principles $\mathsf{RT}^n_k$.


9.1.1 Ramsey’s theorem for singletons 


We begin by looking at $\mathsf{RT}^1_k$. In Proposition 4.3.7 (Exercise 4.8.4) we saw that if
$k > j$ then $\mathsf{RT}^1_k \nleq_{\mathrm{W}} \mathsf{RT}^1_j$. In Proposition 4.4.6, we also mentioned (without proof)
an analogous result for $\leq_{\mathrm{sc}}$: if $k > j$ then $\mathsf{RT}^1_k \nleq_{\mathrm{sc}} \mathsf{RT}^1_j$. We now prove this result, in
the following considerably stronger form.


Theorem 9.1.1 (Dzhafarov, Patey, Solomon, and Westrick [90]). For all $k > j$,
$\mathsf{RT}^1_k \nleq_{\mathrm{sc}} \mathsf{SRT}^2_j$.


We must exhibit an instance $c$ of $\mathsf{RT}^1_k$ and, for each Turing functional $\Phi$ for which
$\Phi^c$ is an instance of $\mathsf{SRT}^2_j$, an infinite homogeneous set $H$ for $\Phi^c$ that computes no
infinite homogeneous set for $c$. This is achieved via a cardinality argument, similar
to Proposition 4.3.7. But there are two complicating factors. First, $H$ must now avoid
computing an infinite homogeneous set for $c$ via all possible Turing functionals, not
just a single one. And second, more problematically, $H$ must be homogeneous for a
stable coloring of pairs, which is harder to ensure than being a homogeneous set for
a coloring of singletons, or indeed, being limit homogeneous for a stable coloring
of pairs. More precisely, because we need to control the internal structure of $H$ (the
colors between the elements of $H$), we always need $\Phi^c$ to already be defined on
the elements we wish to add to $H$, and this in turn requires more of $c$ to already be
defined. This causes a tension with diagonalizing the infinite homogeneous sets of $c$.

Before discussing how to overcome this obstacle, let us define the components
of the proof, state the main technical lemma we will need, and then see how
Theorem 9.1.1 follows. We use separate forcing constructions for the construction of
our $\mathsf{RT}^1_k$ instance $c$ and our $\mathsf{SRT}^2_j$ solution $H$. For $c$, we use Cohen forcing with
conditions $\sigma \in k^{<\omega}$; for $H$, we use Mathias forcing. However, the two constructions
are necessarily intertwined. For the rest of this section, let $k > j$ be fixed.


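Definition 9.1.2. Fix $\ell < j$ and a Turing functional $\Phi$. A Cohen condition $\sigma \in k^{<\omega}$
is $(\Phi, \ell)$-compatible with a Mathias condition $(E, I)$ if the following hold.


• $\sigma \Vdash \Phi^{\mathsf{G}}$ is a stable coloring $[\omega]^2 \to j$ for some $j > \ell$.
• $\sigma \Vdash \Phi^{\mathsf{G}}(x, y) = \ell$ for all $x, y \in E$.
• $\sigma \Vdash (\forall y \geq \min I)[\Phi^{\mathsf{G}}(x, y) = \ell]$ for all $x \in E$.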


Definition 9.1.3. We define the following notion of forcing.


1. Conditions are finite tuples $p$ consisting of a Cohen condition $\sigma^p \in k^{<\omega}$, a finite
set $S^p$ of Turing functionals, and for each $\Phi \in S^p$, finite sets $E^p_{\Phi,0}, \ldots, E^p_{\Phi,j-1}$
and an infinite set $I^p_\Phi$ such that the following hold.

• $\sigma^p \Vdash \Phi^{\mathsf{G}}$ is a stable coloring $[\omega]^2 \to j$.
• For each $\ell < j$, $(E^p_{\Phi,\ell}, I^p_\Phi)$ is a Mathias condition.
• For each $\ell < j$, $\sigma^p$ is $(\Phi, \ell)$-compatible with $(E^p_{\Phi,\ell}, I^p_\Phi)$.

2. Extension is defined by $q \leq p$ if $\sigma^q \succeq \sigma^p$, $S^q \supseteq S^p$, and for each $\Phi \in S^p$ and
$\ell < j$, $(E^q_{\Phi,\ell}, I^q_\Phi) \leq (E^p_{\Phi,\ell}, I^p_\Phi)$ as Mathias conditions.

3. The valuation of a condition $p$ is the join of the valuation of $\sigma^p$ as a Cohen
condition with the valuations, for each $\Phi \in S^p$, of the Mathias conditions
$(E^p_{\Phi,0}, I^p_\Phi), \ldots, (E^p_{\Phi,j-1}, I^p_\Phi)$.
Observe that for each Turing functional $\Phi$, the set of conditions $p$ having $\Phi \in S^p$, or
else having $\sigma^p$ force that $\Phi^{\mathsf{G}}$ is not a stable coloring $[\omega]^2 \to j$, is dense. Hence, a
generic for the above forcing consists of an instance $c^G$ of $\mathsf{RT}^1_k$, and for each $\Phi$ such
that $\Phi^{c^G}$ is an instance of $\mathsf{SRT}^2_j$, a $j$-tuple of homogeneous sets $H^G_{\Phi,0}, \ldots, H^G_{\Phi,j-1}$.
By genericity, and the definition of compatibility, it is easily seen that there is at least
one $\ell < j$ such that $H^G_{\Phi,\ell}$ is infinite.

We will only deal with the forcing relation with respect to Cohen conditions, and
so continue to use the symbol $\mathsf{G}$ for the name of the generic $\mathsf{RT}^1_k$ instance $c^G$ in the
$k^{<\omega}$ forcing language. Our main objective will be to prove the following density
lemma.


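Lemma 9.1.4. Fix $i < k$, $\ell < j$, and Turing functionals $\Phi$ and $\Psi$. The set of
conditions $p$ satisfying one of the following properties is dense.


1. For all $x \in I^p_\Phi$, $\sigma^p \Vdash \neg(\lim_y \Phi^{\mathsf{G}}(x, y) = \ell)$.

2. For all finite sets $F \subseteq I^p_\Phi$ and all sufficiently large $x \in \omega$, $\Psi^{E^p_{\Phi,\ell} \cup F}(x) \simeq 0$.

3. For each $x \in I^p_\Phi$ there exists a unique number $w^x \in \omega$ such that for all $\tau \succeq \sigma^p$,
if $\tau(w^x) = i$ then $\tau \Vdash \neg(\lim_y \Phi^{\mathsf{G}}(x, y) = \ell)$.

4. There exists a $w \in \omega$ such that $\sigma^p(w) = i$ and $\Psi^{E^p_{\Phi,\ell}}(w)\downarrow = 1$.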


The theorem now easily follows. 


Proof (of Theorem 9.1.1). Choose a sufficiently generic filter $\mathcal{F}$ for the forcing in
Definition 9.1.3, and let $c^G$ be as above. Suppose $\Phi^{c^G}$ is an instance of $\mathsf{SRT}^2_j$, and
let $H^G_{\Phi,0}, \ldots, H^G_{\Phi,j-1}$ again be as above. We will show that there is an $\ell < j$ such that
$H^G_{\Phi,\ell}$ is infinite and $\Psi^{H^G_{\Phi,\ell}}$ is not an infinite homogeneous set for $c^G$, for any Turing
functional $\Psi$.

Seeking a contradiction, suppose otherwise. Let $C$ be the (necessarily nonempty)
set of all $\ell < j$ such that $H^G_{\Phi,\ell}$ is infinite. We may thus fix a condition $p_0 \in \mathcal{F}$
such that, for each $x \in I^{p_0}_\Phi$ and each $\ell \notin C$, $\sigma^{p_0}$ forces $\neg(\lim_y \Phi^{\mathsf{G}}(x, y) = \ell)$. Now,
for each $\ell \in C$, fix a Turing functional $\Psi_\ell$ via which $H^G_{\Phi,\ell}$ computes an infinite
homogeneous set for $c^G$, say with color $i_\ell < k$. Since $j < k$, we may fix an $i < k$
such that $i \neq i_\ell$ for all $\ell \in C$.

Apply Lemma 9.1.4 repeatedly with $i$, $\Phi$, and $\Psi_\ell$ for each $\ell \in C$ in turn, to
obtain a condition $p \leq p_0$ in $\mathcal{F}$. By assumption, properties (2) and (4) cannot
hold for any $\ell \in C$, since $E^p_{\Phi,\ell} \subseteq H^G_{\Phi,\ell} \subseteq E^p_{\Phi,\ell} \cup I^p_\Phi$, $\sigma^p \prec c^G$, and $\Psi_\ell$ maps $H^G_{\Phi,\ell}$ onto
an infinite homogeneous set for $c^G$ with color different from $i$. And by genericity,
property (1) cannot hold for any $\ell \in C$, since $H^G_{\Phi,\ell}$ is homogeneous (and hence
limit homogeneous) for $\Phi^{c^G}$ with color $\ell$. Hence, $p$ witnesses that (3) holds for each
$\ell \in C$.

It follows that for each $\ell \in C$, each number $x \in I^p_\Phi$ corresponds to a unique
number $w^x_\ell \in \omega$ such that if $\tau$ is any extension of $\sigma^p$ with $\tau(w^x_\ell) = i$ then
$\tau \Vdash \neg(\lim_y \Phi^{\mathsf{G}}(x, y) = \ell)$. The uniqueness means we can choose an $x$ such that $w^x_\ell \geq |\sigma^p|$
for each $\ell \in C$. Then we can find a $\tau \succeq \sigma^p$ with $\tau(w^x_\ell) = i$ for all $\ell \in C$. (It may be that
$w^x_\ell = w^x_{\ell^*}$ for some $\ell \neq \ell^*$, but this does not matter.) So $\tau$ forces $\neg(\lim_y \Phi^{\mathsf{G}}(x, y) = \ell)$
for all $\ell \in C$. But $x \in I^p_\Phi \subseteq I^{p_0}_\Phi$ and $\tau \succeq \sigma^{p_0}$, so $\tau \Vdash \neg(\lim_y \Phi^{\mathsf{G}}(x, y) = \ell)$ for all
$\ell < j$, which cannot be. □


We move on to prove Lemma 9.1.4. Let us fix $i < k$, $\ell < j$, and the functionals
$\Phi$ and $\Psi$ for the remainder of this discussion. We may assume $\Psi$ is $\{0, 1\}$-valued, so
that if $\Psi^Z$ is total for some $Z$ then we can view it as a set.


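Definition 9.1.5. Fix $n \in \omega$ and a Mathias condition $(E, I)$. We define $T(n, E, I) \subseteq I^{<\omega}$
as follows.


• $\langle\rangle \in T(n, E, I)$.
• A nonempty string $\alpha \in I^{<\omega}$ belongs to $T(n, E, I)$ if and only if $\alpha$ is increasing
and
$(\forall F \subseteq \operatorname{range}(\alpha^{\#}))(\forall x \geq n)\,[\Psi^{E \cup F}(x) \simeq 0]$.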


The following basic facts about $T(n, E, I)$ are straightforward to check. The key
point to remember in, e.g., (2) or (3), is that since $(E, I)$ is a Mathias condition, $I$ is
an infinite set.


Lemma 9.1.6. Fix $n \in \omega$, a Mathias condition $(E, I)$, and let $T = T(n, E, I)$. Then
$T$ has the following properties.


1. $T$ is closed under initial segments, so $T$ is a tree.

2. If $\alpha \in T$ is not terminal, then $\alpha x \in T$ for all $x \in I$ with $x > \operatorname{range}(\alpha)$.

3. If $\alpha \in T$ is terminal, then there is an $F \subseteq \operatorname{range}(\alpha)$ and an $x \geq n$ such that
$\Psi^{E \cup F}(x)\downarrow = 1$.

4. If $T$ is not well founded and $I^* = \operatorname{range}(f)$ for some $f \in [T]$ then for all sets $Z$
with $E \subseteq Z \subseteq E \cup I^*$, either $\Psi^Z$ is not total or $\Psi^Z$ is a finite set.


The main combinatorial construct we need is a labeling of the nodes in this tree, 
which we use as a guide for adding elements to the homogeneous sets we are building. 
The method used in the proof of Lemma 9.1.4 is accordingly called the tree labeling 
method. It was originally developed by Dzhafarov [80], but then extended for the 
proof of the present theorem. We will mention other applications of this method 
later. 


Definition 9.1.7 (Labeling and labeled subtree). Fix $n \in \omega$, a Mathias condition
$(E, I)$, and let $T = T(n, E, I)$. Suppose $T$ is well founded.


1. For each terminal $\alpha \in T$, the least $F \subseteq \operatorname{range}(\alpha)$ and $x \geq n$ as in Lemma 9.1.6 (3)
are denoted $F^\alpha$ and $w^\alpha$, respectively.
2. The label of a node $\alpha \in T$ is defined inductively as follows.

• If $\alpha$ is terminal, its label is $w^\alpha$.
• Suppose $\alpha$ is not terminal, and every $\beta \succ \alpha$ in $T$ has been labeled. If
infinitely many immediate successors of $\alpha$ in $T$ have a common label $w$, the
label of $\alpha$ is the least such $w$. Otherwise, the label of $\alpha$ is $\infty$.

3. The labeled subtree of $T$ is the tree $T^L = T^L(n, E, I)$ defined inductively as
follows.

• $\langle\rangle \in T^L$.
• Suppose $\alpha \in T^L$ and $\alpha$ is not terminal in $T$. If $\alpha$ has infinitely many
immediate successors in $T$ with the same label as its own (either numerical,
or $\infty$), then each such successor is in $T^L$. Otherwise, $\alpha$ has immediate
successors with infinitely many numerical labels $w$, and for each such $w$, the
least successor of $\alpha$ with label $w$ is in $T^L$.


We again collect some basic facts about this definition. 


Lemma 9.1.8. Fix $n \in \omega$, a Mathias condition $(E, I)$, and let $T = T(n, E, I)$.
Suppose $T$ is well-founded and let $T^L = T^L(n, E, I)$.


1. $\alpha \in T^L$ is terminal in $T^L$ if and only if it is terminal in $T$.

2. If $\alpha \in T^L$ has label $w \in \omega$, then so does every $\beta \succeq \alpha$ in $T^L$, and $w \geq n$.

3. Each nonterminal $\alpha \in T^L$ has infinitely many immediate successors in $T^L$, and
these either all have the same numerical label $w$, in which case so does $\alpha$, or
they all have label $\infty$, in which case so does $\alpha$, or they each have a different
numerical label, in which case $\alpha$ has label $\infty$.

4. If $\alpha \in T^L$ is terminal then its label is some $w \in \omega$, and there is an $F \subseteq \operatorname{range}(\alpha)$
such that $\Psi^{E \cup F}(w)\downarrow = 1$.


We will also need the following less obvious fact, which will be the main tool used for
building homogeneous sets in the proof of Lemma 9.1.4. We say a Cohen condition
$\sigma \in k^{<\omega}$ forces that a finite set $F$ is homogeneous and limit homogeneous for $\Phi^{\mathsf{G}}$
with color $\ell$ if $\sigma \Vdash \Phi^{\mathsf{G}}(x, y) = \ell$, for all $x, y \in F$, and $\sigma \Vdash \lim_y \Phi^{\mathsf{G}}(x, y) = \ell$, for all
$x \in F$.


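Lemma 9.1.9. Fix $n \in \omega$, a Mathias condition $(E, I)$, and let $T = T(n, E, I)$.
Suppose $T$ is well-founded and let $T^L = T^L(n, E, I)$. Suppose $\sigma \in k^{<\omega}$ is
$(\Phi, \ell)$-compatible with $(E, I)$. Then one of the following possibilities holds:


1. There is a $\sigma^* \succeq \sigma$ and an infinite set $I^* \subseteq I$ such that
$\sigma^* \Vdash \neg(\lim_y \Phi^{\mathsf{G}}(x, y) = \ell)$ for each $x \in I^*$.

2. If $\alpha \in T^L$ and $\sigma$ forces that $E \cup \operatorname{range}(\alpha)$ is homogeneous and limit homogeneous
for $\Phi^{\mathsf{G}}$ with color $\ell$, then there is a $\sigma^* \succeq \sigma$ and an $\alpha^* \succeq \alpha$ such that $\alpha^*$ is terminal
in $T^L$ and $\sigma^*$ forces that $E \cup \operatorname{range}(\alpha^*)$ is homogeneous and limit homogeneous
for $\Phi^{\mathsf{G}}$ with color $\ell$.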


Proof. Suppose $\sigma \in k^{<\omega}$ is $(\Phi, \ell)$-compatible with $(E, I)$, and that no $\sigma^*$ and $I^*$ as
in part (1) exist. Let $\alpha \in T^L$ be as in (2). We define a finite sequence $\sigma_0 \preceq \sigma_1 \preceq \cdots$
of Cohen conditions and a finite sequence $\alpha_0 \prec \alpha_1 \prec \cdots$ of elements of $T^L$. Let
$\sigma_0 = \sigma$ and $\alpha_0 = \alpha$. Fix $s \geq 0$, and assume inductively that we have defined $\sigma_s$ and
$\alpha_s$, and that $\sigma_s$ forces that $E \cup \operatorname{range}(\alpha_s)$ is homogeneous and limit homogeneous for $\Phi^{\mathsf{G}}$ with
color $\ell$. If $\alpha_s$ is terminal, we do nothing. Otherwise, fix $b$ such that
$\sigma_s \Vdash (\forall y \geq b)[\Phi^{\mathsf{G}}(x, y) = \ell]$ for all $x \in \operatorname{range}(\alpha_s)$ and let $I^* = \{x \in I : x > b \wedge \alpha_s x \in T^L\}$.
Then $I^*$ is infinite by definition of $T^L$. If there were no $x \in I^*$ and no $\tau \succeq \sigma_s$ forcing
that $\lim_y \Phi^{\mathsf{G}}(x, y) = \ell$ then (1) would apply with $\sigma^* = \sigma_s$ and $I^*$, a contradiction.
So we may fix some such $\tau$ and $x$. Let $\sigma_{s+1} = \tau$ and let $\alpha_{s+1} = \alpha_s x$. Clearly, the
inductive hypotheses are preserved. This completes the construction. Letting $\sigma^*$ be
the final element defined in the sequence $\sigma_0, \sigma_1, \ldots$, and $\alpha^*$ the final element defined
in the sequence $\alpha_0, \alpha_1, \ldots$, we obtain the conclusion of (2). □


We can now prove our main lemma, completing the proof of Theorem 9.1.1. 


Proof (of Lemma 9.1.4). Fix a Cohen condition $\sigma \in k^{<\omega}$ and a Mathias condition
$(E, I)$ for which $\sigma$ is $(\Phi, \ell)$-compatible. We prove there is a $\sigma^* \succeq \sigma$ and an
$(E^*, I^*) \leq (E, I)$ such that $\sigma^*$ is $(\Phi, \ell)$-compatible with $(E^*, I^*)$ and one of the
following properties holds.


1. For all $x \in I^*$, $\sigma^* \Vdash \neg(\lim_y \Phi^{\mathsf{G}}(x, y) = \ell)$.

2. For all finite sets $F \subseteq I^*$ and all sufficiently large $x \in \omega$, $\Psi^{E^* \cup F}(x) \simeq 0$.

3. For each $x \in I^*$ there exists a unique number $w^x \in \omega$ such that for all $\tau \succeq \sigma^*$, if
$\tau(w^x) = i$ then $\tau \Vdash \neg(\lim_y \Phi^{\mathsf{G}}(x, y) = \ell)$.

4. There exists a $w \in \omega$ such that $\sigma^*(w) = i$ and $\Psi^{E^*}(w)\downarrow = 1$.


These are just like (1)–(4) in the statement of Lemma 9.1.4, but this presentation
is notationally simpler. And since $\sigma$ and $(E, I)$ are arbitrary, this will give us the
lemma. We break into cases.


Case 1: There is a $\sigma^* \succeq \sigma$ and an infinite set $I^* \subseteq I$ such that $\sigma^* \Vdash \neg(\lim_y \Phi^{\mathsf{G}}(x, y) = \ell)$
for each $x \in I^*$. We let $E^* = E$. Then $\sigma^*$ and $(E^*, I^*)$ satisfy (1) above.


Case 2: Otherwise. Let $n = \max(\{|\sigma|\} \cup E)$ and let $T = T(n, E, I)$. If $T$ is not
well-founded, pick any $f \in [T]$ and let $I^* = \operatorname{range}(f)$. In this case, let $\sigma^* = \sigma$ and
$E^* = E$. Then by Lemma 9.1.6 (4), it follows that $\sigma^*$ and $(E^*, I^*)$ satisfy (2) above.
So assume next that $T$ is well-founded. Let $T^L = T^L(n, E, I)$. We break into subcases
based on the label of the root, $\langle\rangle$.


Subcase 2a: $\langle\rangle$ has label $w \in \omega$. Let $\sigma_0$ be any extension of $\sigma$ with $\sigma_0(w) = i$,
which exists since $w \geq n \geq |\sigma|$. Now $\sigma_0$ is $(\Phi, \ell)$-compatible with $(E, I)$
since $\sigma$ is, and therefore $\sigma_0$ forces that $E \cup \operatorname{range}(\langle\rangle) = E$ is homogeneous and
limit homogeneous for $\Phi^{\mathsf{G}}$ with color $\ell$. Since Case 1 does not hold, Lemma 9.1.9
yields a $\sigma^* \succeq \sigma_0$ and a terminal $\alpha^* \in T^L$ such that $\sigma^*$ forces that $E \cup \operatorname{range}(\alpha^*)$
is homogeneous and limit homogeneous for $\Phi^{\mathsf{G}}$ with color $\ell$. By Lemma 9.1.8 (2),
the label of $\alpha^*$ is $w$, so by Lemma 9.1.8 (4), there is an $F \subseteq \operatorname{range}(\alpha^*)$ such
that $\Psi^{E \cup F}(w)\downarrow = 1$. Let $E^* = E \cup F$ and $I^* = \{x \in I : x > E^*\}$. Then $\sigma^*$ is
$(\Phi, \ell)$-compatible with $(E^*, I^*)$ and (4) holds.


Subcase 2b: $\langle\rangle$ has label $\infty$. Fix a $\preceq$-maximal $\alpha \in T^L$ with label $\infty$ for which there
is a $\sigma_0 \succeq \sigma$ forcing that $\operatorname{range}(\alpha)$ is homogeneous and limit homogeneous for $\Phi^{\mathsf{G}}$
with color $\ell$. (Note that such an $\alpha$ exists because $T$ is well-founded.) Fix $b$ so that
$\sigma_0 \Vdash (\forall y \geq b)[\Phi^{\mathsf{G}}(x, y) = \ell]$, for all $x \in \operatorname{range}(\alpha)$. By Lemma 9.1.8 (4), $\alpha$ is not
terminal in $T^L$, so $I^* = \{x \in I : x > b \wedge \alpha x \in T^L\}$ is infinite. We claim that $\alpha x$ has
numerical label, for each $x \in I^*$. If not, then $\alpha x$ would have label $\infty$ for each such
$x$, and so the reason for $\alpha$ being maximal would be that no extension of $\sigma_0$ forces
$\lim_y \Phi^{\mathsf{G}}(x, y) = \ell$. But then $\sigma_0$ would force $\neg(\lim_y \Phi^{\mathsf{G}}(x, y) = \ell)$ for all $x \in I^*$, which
cannot be since we are not in Case 1. So the claim holds. For each $x \in I^*$, let $w^x \in \omega$
be the label of $\alpha x$. By Lemma 9.1.8 (3), each $w^x$ is unique. Now if, for each $x \in I^*$,
every $\tau \succeq \sigma_0$ with $\tau(w^x) = i$ forces $\neg(\lim_y \Phi^{\mathsf{G}}(x, y) = \ell)$, we can set $E^* = E$ and
$\sigma^* = \sigma_0$, and then $\sigma^*$ and $(E^*, I^*)$ satisfy (3) above. So assume not, and fix a witnessing $x \in I^*$
and corresponding $\tau$. Since $x > b$, we have $\tau$ forcing that $\operatorname{range}(\alpha x)$ is homogeneous
and limit homogeneous for $\Phi^{\mathsf{G}}$. Lemma 9.1.9 now yields a $\sigma^* \succeq \tau$ and a terminal
$\alpha^* \in T^L$ extending $\alpha x$ such that $\sigma^*$ forces that $E \cup \operatorname{range}(\alpha^*)$ is homogeneous and
limit homogeneous for $\Phi^{\mathsf{G}}$ with color $\ell$. By Lemma 9.1.8 (2), the label of $\alpha^*$ is $w^x$,
so by Lemma 9.1.8 (4), there is an $F \subseteq \operatorname{range}(\alpha^*)$ such that $\Psi^{E \cup F}(w^x)\downarrow = 1$. Let
$E^* = E \cup F$ and $I^* = \{x \in I : x > E^*\}$. Then $\sigma^*$ is $(\Phi, \ell)$-compatible with $(E^*, I^*)$
and (4) holds. □


Theorem 9.1.1 can be further strengthened using the notion of strong omniscient 
computable reducibility, which was introduced in Definition 8.8.9. 


Theorem 9.1.10 (Dzhafarov, Patey, Solomon, and Westrick [90]). For all $k > j$,
$\mathsf{RT}^1_k \nleq_{\mathrm{soc}} \mathsf{SRT}^2_j$.


We omit the proof, as it requires ideas from set theory that are somewhat outside the
scope of this text. Basically, because we now need to deal with arbitrary instances of
$\mathsf{SRT}^2_j$, rather than merely those computable from $c^G$, we lose the ability to directly
“talk about” these instances in our forcing language. The workaround is to make $c^G$
be generic over a suitable countable model of ZFC, diagonalize all $\mathsf{SRT}^2_j$ instances
in this model (which we can name), and then apply an absoluteness argument to
extend the result to $\mathsf{SRT}^2_j$ instances in general. The combinatorics underlying this
argument, however, are exactly the same, and the reader who has understood these
combinatorics and is familiar with basic forcing arguments in set theory should have
little difficulty following the proof in [90].


9.1.2 Ramsey’s theorem for higher exponents 


We now move on to consider $\mathsf{RT}^n_k$, for $n \geq 2$. Again, in terms of the analysis in
Chapter 8, all such versions are equivalent. However, under finer reducibilities, it is
possible to show that if $k \neq j$ then $\mathsf{RT}^n_k$ and $\mathsf{RT}^n_j$ are actually quite far apart.


Theorem 9.1.11 (Patey [243]). For all $n \geq 2$ and $k > j \geq 1$, $\mathsf{RT}^n_k \nleq_{\mathrm{c}} \mathsf{RT}^n_j$.


Dorais, Dzhafarov, Hirst, Mileti, and Shafer [72] showed earlier that $\mathsf{RT}^n_k \nleq_{\mathrm{sW}} \mathsf{RT}^n_j$,
and then Hirschfeldt and Jockusch [148] and Brattka and Rakotoniaina [20]
independently showed that $\mathsf{RT}^n_k \nleq_{\mathrm{W}} \mathsf{RT}^n_j$. Each of these results used rather different
methods. Patey’s theorem above greatly extends these results, and has a remarkably
elegant proof, which we now present.

Recall that a set S is hyperimmune if its principal function is not dominated by 
any computable function. Every hyperimmune set is, in particular, immune, meaning 
it has no computable infinite subset. The key definition and lemma we need for the 
proof are the following. 


Definition 9.1.12 (Patey [243]). Fix $k \geq m \geq 1$. A problem P admits preservation
of $m$ among $k$ hyperimmunities if for every $A \in 2^\omega$ and every sequence $S_0, \ldots, S_{k-1}$
of $A$-hyperimmune sets, every $A$-computable instance $X$ of P has a solution $Y$ such
that at least $m$ many of the sets $S_0, \ldots, S_{k-1}$ are $(A \oplus Y)$-hyperimmune.


Lemma 9.1.13 (Patey [243]). For all $k > j \geq 1$, $\mathsf{RT}^2_j$ admits preservation of 2
among $k$ hyperimmunities.


Let us see how the lemma implies Theorem 9.1.11. 


Proof (of Theorem 9.1.11). Since $\mathsf{RT}^n_1$ is computably true and, for all $k \geq 2$, $\mathsf{RT}^n_k$ is
not (by Theorem 8.2.2), the result is obvious if $j = 1$. So, we may assume $j \geq 2$. We
first prove the following claim.


Claim. There exists a $\Delta^0_n$ sequence $\langle S_0, \ldots, S_{k-1} \rangle$ whose members partition $\omega$ such
that for every computable $d\colon [\omega]^n \to j$ there exists an infinite homogeneous set $H$ for
$d$ and distinct numbers $i_0, i_1 < k$ with the property that every infinite $H$-computable
set intersects both $S_{i_0}$ and $S_{i_1}$.


The proof is by induction on $n$. First, suppose $n = 2$. By Exercise 2.9.15, there is a
$\Delta^0_2$ sequence of sets $\langle S_0, \ldots, S_{k-1} \rangle$ whose members partition $\omega$ and for each $i < k$,
$\overline{S_i}$ is hyperimmune. Consider any computable $d\colon [\omega]^2 \to j$. By Lemma 9.1.13, $d$
has an infinite homogeneous set $H$ such that for some distinct $i_0, i_1 < k$, $\overline{S_{i_0}}$ and $\overline{S_{i_1}}$ are
$H$-hyperimmune, and so in particular $H$-immune. This means every $H$-computable
infinite set intersects both $S_{i_0}$ and $S_{i_1}$, as was to be shown.

Clearly, the above relativizes. So fix $n > 2$, and assume that the result holds
for $n - 1$ and that this also relativizes. We prove the result for $n$ in unrelativized
form, but the proof will easily be seen to relativize as well. Fix $A \gg \emptyset'$ with
$A' \leq_{\mathrm{T}} \emptyset''$, and apply the inductive hypothesis relative to $A$ to obtain a $\Delta^0_{n-1}(A)$
sequence $\langle S_0, \ldots, S_{k-1} \rangle$. By Post’s theorem (Theorem 2.6.2) we have

$\Delta^0_{n-1}(A) = \Delta^0_{n-2}(A') \subseteq \Delta^0_{n-2}(\emptyset'') = \Delta^0_n$,

so $\langle S_0, \ldots, S_{k-1} \rangle$ is $\Delta^0_n$. We claim that this is the desired partition. Fix any computable
$d\colon [\omega]^n \to j$. By Exercise 8.10.1, $d$ has an $A$-computable infinite pre-homogeneous
set $P$. Define $d^*\colon [P]^{n-1} \to j$ as follows: for all $\vec{x} \in [P]^{n-1}$, $d^*(\vec{x}) = d(\vec{x}, y)$ for
the least $y \in P$ with $y > \vec{x}$ (which is the same value for any $y \in P$ with $y > \vec{x}$).
Then $d^*$ is $P$-computable, and hence $A$-computable. So, by assumption, there is an
infinite set $H \subseteq P$ homogeneous for $d^*$ and distinct numbers $i_0, i_1 < k$ such that
every $H$-computable infinite set intersects both $S_{i_0}$ and $S_{i_1}$. But $H$ is clearly also
homogeneous for $d$. This proves the claim.
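
The passage from a coloring of $n$-tuples to the induced coloring of $(n-1)$-tuples on a pre-homogeneous set can be tried out on finite data. The following is only a minimal sketch, not part of the argument above: `P` is a finite initial segment standing in for the infinite pre-homogeneous set, colorings take sorted tuples as input, and the helper names `is_prehomogeneous` and `induced_coloring` are hypothetical, introduced here for illustration.

```python
from itertools import combinations

def is_prehomogeneous(d, P, n):
    """Check, on the finite set P, that d(xs, y) does not depend on y > max(xs)."""
    P = sorted(P)
    for xs in combinations(P, n - 1):
        colors = {d(xs + (y,)) for y in P if y > xs[-1]}
        if len(colors) > 1:
            return False
    return True

def induced_coloring(d, P):
    """Return d*, where d*(xs) = d(xs, y) for the least y in P above xs."""
    P = sorted(P)
    def d_star(xs):
        xs = tuple(xs)
        later = [y for y in P if y > xs[-1]]
        if not later:
            raise ValueError("no element of P above the given tuple")
        return d(xs + (later[0],))
    return d_star

# Toy example with n = 3: the color ignores the last coordinate,
# so every set is pre-homogeneous for d.
d = lambda t: (t[0] + t[1]) % 2
P = list(range(10))
d_star = induced_coloring(d, P)

# The even numbers are homogeneous for d*, and hence also for d.
H = [x for x in P if x % 2 == 0]
```

In this toy case any set homogeneous for `d_star` is visibly homogeneous for `d`, which is the finite shadow of the final step in the proof of the claim.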
To complete the proof, we exhibit a computable instance $c$ of $\mathsf{RT}^n_k$ such that
every $c$-computable (hence computable) instance $d$ of $\mathsf{RT}^n_j$ has a solution computing
no solution to $c$. Fix a $\Delta^0_n$ sequence $\langle S_0, \ldots, S_{k-1} \rangle$ as in the claim. By iterating
Proposition 8.4.5 $n - 1$ times, we obtain a computable coloring $c\colon [\omega]^n \to k$ whose
infinite homogeneous sets are precisely the infinite subsets of the $S_i$. But by choice
of $\langle S_0, \ldots, S_{k-1} \rangle$, every computable $d\colon [\omega]^n \to j$ has an infinite homogeneous set
$H$ such that no $H$-computable infinite set is contained in any $S_i$. □


Let us return to fill in the proof of Lemma 9.1.13. The proof uses the
Cholak–Jockusch–Slaman decomposition by establishing analogous results for COH and for
$\mathsf{D}^2_j$. Let us begin with the former. Notice that preserving 2 among $k$ hyperimmunities,
which is what we wish to establish for $\mathsf{RT}^2_j$, is not a property (like lowness or cone
avoidance, for example) that iterates well. Thus, for COH, we actually prove a stronger
preservation property.


Definition 9.1.14 (Patey [243]). Fix $k \geq m \geq 1$. A problem P admits preservation
of hyperimmunity if for every $A \in 2^\omega$ and every sequence $S_0, S_1, \ldots$ of
$A$-hyperimmune sets, every $A$-computable instance $X$ of P has a solution $Y$ such that
each of the sets $S_0, S_1, \ldots$ is $(A \oplus Y)$-hyperimmune.


Lemma 9.1.15 (Patey [243]). COH admits preservation of hyperimmunity. 


Proof. We prove the unrelativized version. Let $\vec{R} = \langle R_i : i \in \omega \rangle$ be a computable
instance of COH and let $S_0, S_1, \ldots$ be a sequence of hyperimmune sets. Let $G$
be sufficiently generic for Mathias forcing with computable reservoirs. Then $G$
is an infinite $\vec{R}$-cohesive set, and we claim that each of the sets $S_0, S_1, \ldots$ is
$G$-hyperimmune. To see this, consider an arbitrary condition $(E, I)$ and arbitrary $e, i \in \omega$.
We claim there is an extension $(E^*, I^*)$ of $(E, I)$ forcing that $\Phi_e^{\mathsf{G}}$ does not dominate
$p_{S_i}$. Define a partial computable $f\colon \omega \to \omega$ as follows: given $x \in \omega$, let $F$ be the
least finite subset of $I$, if it exists, such that $\Phi_e^{E \cup F}(x)\downarrow$ (with use bounded by $\max F$), and
set $f(x) = \Phi_e^{E \cup F}(x)$; otherwise, $f(x)\uparrow$.

Now, if $f$ is not total, then $(E, I)$ forces that $\Phi_e^{\mathsf{G}}$ is not total, so we can just set
$(E^*, I^*) = (E, I)$. Otherwise, $f$ is a computable function, and since $S_i$ is hyperimmune
there must be an $x$ such that $f(x) < p_{S_i}(x)$. In this case, fix $F \subseteq I$ such that
$\Phi_e^{E \cup F}(x)\downarrow < p_{S_i}(x)$, and let $E^* = E \cup F$ and $I^* = \{y \in I : y > \max F\}$. Now $(E^*, I^*)$
is the desired extension. By genericity, no $G$-computable function can dominate $p_{S_i}$
for any $i$, hence each of the sets $S_0, S_1, \ldots$ is $G$-hyperimmune, which is what we
wanted to show. □
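
The search defining $f$ in the proof above can be illustrated by a small finite model. This is a hedged sketch only: a Turing functional is modeled by a hypothetical Python function `phi(oracle, x)` that returns a value when the computation converges on the finite oracle and `None` otherwise, the reservoir is replaced by a finite prefix, "least" is taken to mean first in order of size and then lexicographically, and the bound on the use is ignored. The names `finite_subsets`, `search_value`, and `toy_phi` are assumptions made for this illustration.

```python
from itertools import combinations

def finite_subsets(elements):
    """Enumerate subsets of a sorted list by size, then lexicographically."""
    for size in range(len(elements) + 1):
        for F in combinations(elements, size):
            yield frozenset(F)

def search_value(phi, E, I_prefix, x):
    """Return (F, value) for the first F found with phi(E | F, x) convergent.

    Returns None if no subset of the finite prefix works; in the real
    construction the search would simply continue through the reservoir.
    """
    for F in finite_subsets(sorted(I_prefix)):
        value = phi(frozenset(E) | F, x)
        if value is not None:
            return F, value
    return None

def toy_phi(oracle, x):
    """Toy functional: converges once the oracle has more than x elements."""
    return max(oracle) if len(oracle) > x else None
```

For instance, `search_value(toy_phi, {1}, [5, 6, 7, 8], 2)` finds the set `{5, 6}` with value `6`. In the proof, the set $F$ found this way is what gets added to $E$ when the value lies below $p_{S_i}(x)$.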


Now let us consider the situation for $\mathsf{D}^2_j$. As in the alternative proof of Seetapun’s
theorem given in Section 8.5.1, it is more convenient to work with $\mathsf{RT}^1_j$, and so we
prove a “strong” version of the property we wish to preserve. The relevant definition
is the following, building on Definition 9.1.12.


Definition 9.1.16 (Patey [243]). Fix $k \geq m \geq 1$ and let P be a problem. Then P
admits strong preservation of $m$ among $k$ hyperimmunities if the condition in
Definition 9.1.12 holds for every instance $X$ of P (computable from $A$ or not).
Lemma 9.1.17 (Patey [243]). For all $k > j \geq 1$, $\mathsf{RT}^1_j$ admits strong preservation
of 2 among $k$ hyperimmunities.


Proof. The proof is by induction on $j$. If $j = 1$, then the result holds trivially. Indeed,
there is only one instance of $\mathsf{RT}^1_1$, the constant coloring $x \mapsto 0$, and this has $\omega$ as
a solution. And if $S_0, \ldots, S_{k-1}$ are $A$-hyperimmune then of course they are also
$(A \oplus \omega)$-hyperimmune. So fix $j > 1$, and assume the result holds for $j - 1$. Fix
$A \in 2^\omega$, a sequence of $A$-hyperimmune sets $S_0, \ldots, S_{k-1}$, and a coloring $c\colon \omega \to j$
(not necessarily $A$-computable).


Case 1: There is an infinite set $I$, an $\ell < j$, and distinct numbers $i_0, i_1 < k$ such that
$c(x) \neq \ell$ for all $x \in I$ and $S_{i_0}$ and $S_{i_1}$ are $(A \oplus I)$-hyperimmune. By relabeling, we
may assume $\ell = j - 1$. Say $I = \{m_0 < m_1 < \cdots\}$. Define $c^*\colon \omega \to j - 1$ as follows:
for all $x$, $c^*(x) = c(m_x)$. Then $c^* \leq_{\mathrm{T}} c \oplus I$. By inductive hypothesis
relative to $A \oplus I$ and using the sequence $S_{i_0}, S_{i_1}$, there is an infinite homogeneous set
$H^*$ for $c^*$ such that two among the sets $S_{i_0}$ and $S_{i_1}$ (i.e., both of them) are
$(A \oplus I \oplus H^*)$-hyperimmune. Let $H = \{m_x : x \in H^*\}$, which now is an infinite homogeneous set
for $c$. Then $H \leq_{\mathrm{T}} I \oplus H^*$, so also $S_{i_0}$ and $S_{i_1}$ are $(A \oplus H)$-hyperimmune.
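
The two translations used in Case 1, pulling the coloring back along the enumeration of $I$ and pushing a solution forward again, are easy to check on finite data. The following is a minimal sketch under the assumption that a finite list stands in for the infinite set $I$; the helper names `pull_back` and `push_forward` are hypothetical and introduced only for this illustration.

```python
def pull_back(c, I):
    """Return c*, where c*(x) = c(m_x) and m_0 < m_1 < ... enumerates I."""
    m = sorted(I)
    return lambda x: c(m[x])

def push_forward(H_star, I):
    """Return H = {m_x : x in H*} as a sorted list."""
    m = sorted(I)
    return [m[x] for x in sorted(H_star)]

# Toy example with j = 3 colors: c(x) = x mod 3, and I avoids color 2.
c = lambda x: x % 3
I = [x for x in range(30) if c(x) != 2]
c_star = pull_back(c, I)            # uses only the colors 0 and 1

# A homogeneous set for c* (color 0) and its image under x -> m_x.
H_star = [x for x in range(len(I)) if c_star(x) == 0]
H = push_forward(H_star, I)
```

Here `H` consists of the multiples of 3 below 30, so it is homogeneous for `c` with color 0, mirroring the step from $H^*$ to $H$ in the proof.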


Case 2: Otherwise. In this case, we obtain $H$ by forcing. We force with $j$-fold Mathias
conditions $(E_0, \ldots, E_{j-1}, I)$ such that each of the sets $S_0, \ldots, S_{k-1}$ is
$(A \oplus I)$-hyperimmune, and such that for each $\ell < j$ and $x \in E_\ell$, $c(x) = \ell$. Thus in particular,
$(\emptyset, \ldots, \emptyset, \omega)$ is a condition, and a sufficiently generic filter yields sets $G_0, \ldots, G_{j-1}$
such that $G_\ell$ is homogeneous for $c$ with color $\ell$. Let $\mathsf{G}_0, \ldots, \mathsf{G}_{j-1}$ be names in our
forcing language for these generic sets. (As usual, these are really abbreviations for
definitions using the symbol $\mathsf{G}$.) We prove two density claims about this notion.


Claim 1: For each $m \in \omega$ and $\ell < j$, the set of conditions forcing $(\exists x)[x > m \wedge x \in \mathsf{G}_\ell]$
is dense. Fix a condition $(E_0, \ldots, E_{j-1}, I)$. Since we are not in Case 1, for each
$\ell < j$ there must exist infinitely many $x \in I$ such that $c(x) = \ell$, because $S_0, \ldots, S_{k-1}$
are $(A \oplus I)$-hyperimmune. So given $m$ and $\ell$, fix $x > m$ in $I$ with $c(x) = \ell$, let
$E^*_\ell = E_\ell \cup \{x\}$, $E^*_{\ell^*} = E_{\ell^*}$ for all $\ell^* \neq \ell$, and let $I^* = \{y \in I : y > x\}$. Then
$(E^*_0, \ldots, E^*_{j-1}, I^*)$ is an extension of $(E_0, \ldots, E_{j-1}, I)$ witnessing the claim.


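Claim 2: For each $i < k$ and all $e_0, \ldots, e_{j-1} \in \omega$, the collection of conditions forcing

$(\exists \ell < j)\,[\Phi_{e_\ell}^{A \oplus \mathsf{G}_\ell}$ does not dominate $p_{S_i}]$

is dense. Fix a condition $(E_0, \ldots, E_{j-1}, I)$. We exhibit an extension that forces,
for some $\ell < j$, either that $\Phi_{e_\ell}^{A \oplus \mathsf{G}_\ell}$ is not total, or that there is an $x$ such that
$\Phi_{e_\ell}^{A \oplus \mathsf{G}_\ell}(x)\downarrow < p_{S_i}(x)$. For each $x \in \omega$, let $\mathcal{C}_x$ be the class of all sets $Z = Z_0 \oplus \cdots \oplus Z_{j-1}$
as follows.


• $Z_0, \ldots, Z_{j-1}$ partition $I$.
• For all $\ell < j$ and all finite sets $F \subseteq Z_\ell$, $\Phi_{e_\ell}^{A \oplus (E_\ell \cup F)}(x)\uparrow$.


Notice that this is a $\Pi^0_1(A \oplus I)$ class, with index as such uniform in $x$.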
First, suppose there is an $x \in \omega$ such that $\mathcal{C}_x \neq \emptyset$. In that case, using the
hyperimmune free basis theorem (Theorem 2.8.22) relative to $A \oplus I$, we can find
some $Z = Z_0 \oplus \cdots \oplus Z_{j-1} \in \mathcal{C}_x$ of hyperimmune free degree relative to $A \oplus I$. This
means that every $(A \oplus I \oplus Z)$-computable function is dominated by an
$(A \oplus I)$-computable function. Since $S_i$ is $(A \oplus I)$-hyperimmune, it follows that it is also
$(A \oplus I \oplus Z)$-hyperimmune, and so in particular $(A \oplus Z_\ell)$-hyperimmune for each $\ell < j$.
Fix $\ell < j$ such that $Z_\ell$ is infinite, and set $(E^*_0, \ldots, E^*_{j-1}, I^*) = (E_0, \ldots, E_{j-1}, Z_\ell)$.
Then $(E^*_0, \ldots, E^*_{j-1}, I^*)$ is an extension of $(E_0, \ldots, E_{j-1}, I)$ forcing that $\Phi_{e_\ell}^{A \oplus \mathsf{G}_\ell}$ is
not total.

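So suppose next that $\mathcal{C}_x = \emptyset$ for every $x$. Let $f\colon \omega \to \omega$ be the function defined
as follows. Given $x \in \omega$, it follows by definition of $\mathcal{C}_x$ that there exists an $m$ such that
for every partition $F_0, \ldots, F_{j-1}$ of $I \upharpoonright m$ (i.e., of a finite set into finite sets) there is an
$\ell < j$ and a finite set $F \subseteq F_\ell$ such that $\Phi_{e_\ell}^{A \oplus (E_\ell \cup F)}(x)\downarrow$. Moreover, $m$ can be found
$(A \oplus I)$-computably, and so can the values of all of the resulting computations of the
form $\Phi_{e_\ell}^{A \oplus (E_\ell \cup F)}(x)\downarrow$ for $F$ contained in some part of some partition $F_0, \ldots, F_{j-1}$
of $I \upharpoonright m$. Then $f(x)$ is the supremum of all these computations. This makes $f$ an
$(A \oplus I)$-computable function. Since $S_i$ is $(A \oplus I)$-hyperimmune, we can fix an $x$ such
that $f(x) < p_{S_i}(x)$. Now for each $\ell < j$, let $F_\ell = \{z \in I : z < m \wedge c(z) = \ell\}$, so
that $F_0, \ldots, F_{j-1}$ partition $I \upharpoonright m$. Fix $\ell < j$ such that there is an $F \subseteq F_\ell$ for which
$\Phi_{e_\ell}^{A \oplus (E_\ell \cup F)}(x)\downarrow$, so that $\Phi_{e_\ell}^{A \oplus (E_\ell \cup F)}(x) \leq f(x)$ by definition of $f$. Let $E^*_\ell = E_\ell \cup F$,
$E^*_{\ell^*} = E_{\ell^*}$ for all $\ell^* \neq \ell$, and let $I^* = \{y \in I : y > \max F\}$. Then $(E^*_0, \ldots, E^*_{j-1}, I^*)$
is an extension of $(E_0, \ldots, E_{j-1}, I)$ forcing that $\Phi_{e_\ell}^{A \oplus \mathsf{G}_\ell}(x) < p_{S_i}(x)$.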


Having proved our claims, let $G_0, \ldots, G_{j-1}$ be sufficiently generic for our notion
of forcing. By the first claim, for each $\ell < j$, $G_\ell$ is an infinite homogeneous set
for $c$ with color $\ell$. By the second claim, for each $i < k$ and all $e_0, \ldots, e_{j-1} \in \omega$
there is an $\ell < j$ such that $\Phi_{e_\ell}^{A \oplus G_\ell}$ does not dominate $p_{S_i}$. By Lachlan’s disjunction
(Lemma 8.3.7) it follows that for each $i < k$, there is an $\ell = \ell_i < j$ such that $\Phi_e^{A \oplus G_\ell}$
does not dominate $p_{S_i}$ for any $e \in \omega$, meaning $S_i$ is $(A \oplus G_\ell)$-hyperimmune. But
since $k > j$, there must exist distinct $i_0, i_1 < k$ with $\ell_{i_0} = \ell_{i_1}$. So let $H = G_{\ell_{i_0}}$. Then
both $S_{i_0}$ and $S_{i_1}$ are $(A \oplus H)$-hyperimmune, which is what was to be shown. □
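
The last step of the proof above is a pigeonhole argument: a map $i \mapsto \ell_i$ from $k$ indices into $j < k$ colors must repeat a value. A minimal sketch of that step follows; the function name `find_collision` is hypothetical, and the list `ell` is assumed to record the value $\ell_i$ at position $i$.

```python
def find_collision(ell):
    """Return the first pair (i0, i1), i0 < i1, with ell[i0] == ell[i1], or None."""
    first_seen = {}
    for i, value in enumerate(ell):
        if value in first_seen:
            return first_seen[value], i
        first_seen[value] = i
    return None
```

A collision always exists when `len(ell)` exceeds the number of available colors, which is exactly the hypothesis $k > j$.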


We are now ready to put everything together to prove Lemma 9.1.13. 


Proof (of Lemma 9.1.13). Fix $k > j \geq 1$, $A \in 2^\omega$, a sequence $S_0, \ldots, S_{k-1}$ of
$A$-hyperimmune sets, and an $A$-computable coloring $c\colon [\omega]^2 \to j$. For each $x$ and
$i < j$, let $R_{x,i} = \{y > x : c(x, y) = i\}$, and let $\vec{R}$ be the family of all these sets. Then
$\vec{R}$ is an $A$-computable instance of COH. By Lemma 9.1.15, there exists an infinite
$\vec{R}$-cohesive set $G$ such that the sets $S_0, \ldots, S_{k-1}$ are $(A \oplus G)$-hyperimmune. As in
Proposition 8.4.17, $c \upharpoonright [G]^2$ is a stable coloring of pairs, so we can define $c^*\colon G \to j$
by $c^*(x) = \lim_{y \in G} c(x, y)$ for all $x \in G$. By Lemma 9.1.17, relative to $A \oplus G$, there is
an infinite homogeneous set $H^*$ for $c^*$ and distinct numbers $i_0, i_1 < k$ such that $S_{i_0}$
and $S_{i_1}$ are $(A \oplus G \oplus H^*)$-hyperimmune. By definition, $H^*$ is limit homogeneous for
$c \upharpoonright [G]^2$, and so by Proposition 8.4.2 there is an infinite homogeneous set $H \subseteq H^*$ for
$c \upharpoonright [G]^2$ which is computable from $(c \upharpoonright [G]^2) \oplus H^* \leq_{\mathrm{T}} c \oplus G \oplus H^* \leq_{\mathrm{T}} A \oplus G \oplus H^*$.
So $S_{i_0}$ and $S_{i_1}$ are obviously also $(A \oplus H)$-hyperimmune. Since $H$ is homogeneous
also for $c$ (not just $c \upharpoonright [G]^2$), the proof is complete. □
9.1.3 Homogeneity vs. limit homogeneity


For the final topic of this section, we look at $\mathsf{SRT}^2_2$ and its variant $\mathsf{D}^2_2$. We saw in
Proposition 8.4.8 and Corollary 8.4.11 that $\mathsf{SRT}^2_2 \equiv_{\mathrm{c}} \mathsf{D}^2_2$ and that $\mathsf{RCA}_0 \vdash \mathsf{SRT}^2_2 \leftrightarrow \mathsf{D}^2_2$.
With respect to the reducibilities studied in Chapter 4, the next ones to consider
are $\leq_{\mathrm{W}}$ and $\leq_{\mathrm{sc}}$. And as it turns out, these are able to distinguish the two principles.


Theorem 9.1.18 (Dzhafarov [80]).


1. $\mathsf{SRT}^2_2 \nleq_{\mathrm{W}} \mathsf{D}^2$.
2. $\mathsf{SRT}^2_2 \nleq_{\mathrm{sc}} \mathsf{D}^2$.


One way to interpret this is that the proof of Proposition 8.4.2 (the result that every 
limit homogeneous set can be computably thinned to a homogeneous one) cannot be 
made fully uniform, nor can it avoid making essential use of the initial coloring. 


Proof (of Theorem 9.1.18). We begin with part (1). Fix Turing functionals $\Phi$ and
$\Psi$. We build a stable coloring $c : [\omega]^2 \to 2$ such that if $\Phi^c$ is a stable coloring
$[\omega]^2 \to k$ for some $k \geq 2$, then $\Phi^c$ has an infinite limit homogeneous set L such that
$\Psi^{c \oplus L}$ is not homogeneous for c. In fact, it is possible to build c to be computable,
but we will give a forcing argument, as this makes the argument shorter.

Conditions are triples $(n, \sigma, f)$ such that $n \in \omega$, $\sigma$ is a function $[n]^2 \to 2$, and f is
a function $n \to 2$. A condition $(n^*, \sigma^*, f^*)$ extends $(n, \sigma, f)$ if $n^* \geq n$, $\sigma^* \supseteq \sigma$ and
$f^* \supseteq f$ as functions, and for all $x < n$ and all y with $n \leq y < n^*$, $\sigma^*(x, y) = f(x)$.
The valuation of $(n, \sigma, f)$ is simply $\sigma$, as a finite set of codes of ordered triples
$(x, y, i)$ for $x < y$ and $i < 2$. Every sufficiently generic filter thus determines a stable
coloring $G : [\omega]^2 \to 2$: for all x, $\lim_y G(x, y) = f(x)$ for any condition $(n, \sigma, f)$ in
the filter with $x < n$.

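Given a condition $(n, \sigma, f)$ and $i < 2$, let $c_{\sigma,f,i} : [\omega]^2 \to 2$ be the coloring
defined as follows: for $x < y$,

$c_{\sigma,f,i}(x, y) = \begin{cases} \sigma(x, y) & \text{if } x, y < n, \\ f(x) & \text{if } x < n \wedge n \leq y, \\ i & \text{if } n \leq x, y. \end{cases}$

Our desired coloring c will either be a generic G as above, or else some $c_{\sigma,f,i}$.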
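
The three-case definition of the coloring determined by a condition and a default color can be sketched in Python as follows. This is only an illustration under assumptions: the names `make_coloring`, `sigma`, `f`, and `i`, and the dictionary and list encodings of the finite data, are hypothetical and not notation from the text.

```python
def make_coloring(n, sigma, f, i):
    """Return the coloring determined by a condition (n, sigma, f) and a color i.

    sigma: dict mapping pairs (x, y) with x < y < n to a color in {0, 1}
    f:     list of length n giving the intended limit color of each x < n
    i:     default color (0 or 1) for pairs lying entirely at or above n
    (These encodings are assumptions made for this sketch.)
    """
    def c(x, y):
        if not x < y:
            raise ValueError("c is only defined on pairs x < y")
        if y < n:        # x, y < n: follow the finite coloring sigma
            return sigma[(x, y)]
        if x < n:        # x < n <= y: commit to the limit color f(x)
            return f[x]
        return i         # n <= x, y: constant color i
    return c


# A small hypothetical condition with n = 3.
n = 3
sigma = {(0, 1): 1, (0, 2): 0, (1, 2): 1}
f = [0, 1, 1]
c = make_coloring(n, sigma, f, 0)
```

Every x below n reaches its limit color f(x) as soon as y is at least n, and every x at or above n has limit color i, which is why a coloring of this shape is always stable.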


Case 1: There is a condition $(n, \sigma, f)$, an $i < 2$, and an infinite low set $I \subseteq \omega$ with
the following property: there is no finite set $F \subseteq I$ and no numbers $v > u \geq n$ such
that $\Psi^{c_{\sigma,f,i} \oplus F}(u)\downarrow = \Psi^{c_{\sigma,f,i} \oplus F}(v)\downarrow = 1$. In this case, take $c = c_{\sigma,f,i}$. If $\Phi^c$ is a
stable coloring, let L be any infinite limit homogeneous set for it contained in I.
Then by assumption, $\Psi^{c \oplus L}$ cannot be an infinite set.


Case 2: Otherwise. If there is no condition forcing that $\Phi^G$ is a stable coloring,
we may take c to be any generic G, and we are done. Thus, let us fix a condition
$(n_0, \sigma_0, f_0)$ forcing that $\Phi^G$ is a stable coloring $[\omega]^2 \to k$ for some $k \geq 2$. Fix
a $\subseteq$-minimal nonempty set $C \subseteq \{0, 1, \ldots, k-1\}$ with the property that there is
a condition $(n_1, \sigma_1, f_1) \leq (n_0, \sigma_0, f_0)$ and an infinite low set $I \subseteq \omega$ such that
$(n_1, \sigma_1, f_1)$ forces the following.


• For each $x \in I$, $\lim_y \Phi^G(x, y) \in C$.
• For each $j \in C$ there are infinitely many $x \in I$ with $\lim_y \Phi^G(x, y) = j$.


(Since I is low, it can be defined and hence talked about in our forcing language.) Say
$C = \{k_0 < \cdots < k_{m-1}\}$, $m > 0$. Consider the class $\mathcal{C}$ of all sets $Z = Z_0 \oplus \cdots \oplus Z_{m-1}$
as follows.


• For each $\ell < m$, there is no finite set $F \subseteq Z_\ell$ and no numbers $v > u \geq n_1$ such
that $\Psi^{c_{\sigma_1,f_1,0} \oplus F}(u)\downarrow = \Psi^{c_{\sigma_1,f_1,0} \oplus F}(v)\downarrow = 1$.
• $Z_0, \ldots, Z_{m-1}$ partition I.


Notice that $\mathcal{C}$ is a $\Pi^0_1(I)$ class. We investigate two subcases.


Subcase 2a: $\mathcal{C} \neq \emptyset$. Apply the low basis theorem to find an element $Z = Z_0 \oplus \cdots \oplus Z_{m-1}$
of $\mathcal{C}$ which is low over I, and hence low. Fix $\ell < m$ such that $Z_\ell$ is infinite. Now
$(n_1, \sigma_1, f_1)$, $i = 0$, and $Z_\ell$ witness that we are in Case 1, a contradiction.


Subcase 2b: $\mathcal{C} = \emptyset$. By compactness, there is an $s \in \omega$ such that for any partition
$Z_0, \ldots, Z_{m-1}$ of I there is an $\ell < m$ and a finite $F \subseteq Z_\ell \upharpoonright s$ such that
$\Psi^{c_{\sigma_1,f_1,0} \oplus F}(u)\downarrow = \Psi^{c_{\sigma_1,f_1,0} \oplus F}(v)\downarrow = 1$ for some numbers $v > u \geq n_1$. By our use conventions, this
also means that $u, v < s$. Let $n_2 = s$, $\sigma_2 = c_{\sigma_1,f_1,0} \upharpoonright [n_2]^2$, and let $f_2 : n_2 \to 2$ be
defined as follows: for $x < n_1$, $f_2(x) = f_1(x)$, and for $n_1 \leq x < n_2$, $f_2(x) = 1$.
Note that $(n_2, \sigma_2, f_2) \leq (n_1, \sigma_1, f_1)$. Let G be the stable coloring determined by
any sufficiently generic filter containing the condition $(n_2, \sigma_2, f_2)$, and let $c = G$.
Then $\Phi^c$ is a stable coloring $[\omega]^2 \to k$. By choice of I, the set $L_\ell = \{x \in I : \lim_y \Phi^c(x, y) = k_\ell\}$
is infinite for each $\ell < m$, and $L_0, \ldots, L_{m-1}$ partition I. Since
$L_0 \oplus \cdots \oplus L_{m-1} \notin \mathcal{C}$, we may fix an $\ell < m$ and a finite $F \subseteq L_\ell \upharpoonright s$ such that
$\Psi^{c_{\sigma_1,f_1,0} \oplus F}(u)\downarrow = \Psi^{c_{\sigma_1,f_1,0} \oplus F}(v)\downarrow = 1$ for some numbers v, u with $n_1 \leq u < v < s$.
Since c agrees with $c_{\sigma_1,f_1,0}$ below s, which also bounds the use of these computations,
it follows that $\Psi^{c \oplus F}(u)\downarrow = \Psi^{c \oplus F}(v)\downarrow = 1$. But by definition, $c(u, v) = 0$ and
$\lim_y c(u, y) = f_2(u) = 1$. Let $L = F \cup \{x \in L_\ell : x > s\}$. Then L is an infinite limit
homogeneous set for $\Phi^c$, and $\Psi^{c \oplus L}$, assuming it is the characteristic function of a
set, contains u and v and so cannot be homogeneous for c.


This completes the proof of part (1). Part (2) can be proved using the same notion
of forcing and a similar diagonalization. □


9.2 Partial and linear orders 


In this section, we move from colorings to partial and linear orderings. Throughout,
“partial order” and “linear order” will refer to orderings of subsets of $\omega$ (or, if we
are arguing in a model of $\mathsf{RCA}_0$, of $\mathbb{N}$). We call such an order finite or infinite if its
domain is a finite or infinite set, respectively. For completeness, let us mention that
if $(P, \leq_P)$ is some partial order, we write: $x <_P y$ if $x \leq_P y$ and $x \neq y$; and $x \mid_P y$
if $x \nleq_P y$ and $y \nleq_P x$. We reserve $\leq$ for the usual linear ordering of the natural
numbers. Similarly, “least” and “greatest” will always be meant with respect to $\leq$.
For general partial orders $\leq_P$, we will say “$\leq_P$-least” and “$\leq_P$-greatest”, etc.

Recall that if $(P, \leq_P)$ is a partial order, then a subset S of P is a chain if for
all $x, y \in S$, either $x \leq_P y$ or $y \leq_P x$, and it is an antichain if for all distinct $x, y \in S$,
$x \mid_P y$. The principles we study here are motivated by a combinatorial result known
as Dilworth’s theorem. This states that in a finite partial order $(P, \leq_P)$, the size of
the largest antichain is equal to the size of the smallest partition of P into chains.
One version of this for the countably infinite setting is the following.


Definition 9.2.1 (Chain/antichain principle). CAC is the following statement: every
infinite partial order $(P, \leq_P)$ has an infinite chain or antichain.


A related version for linear orders can be obtained using the following definition.

Definition 9.2.2. Let $(L, \leq_L)$ be a linear order. A subset S of L is:


1. an ascending sequence if for all $x, y \in S$, if $x \leq y$ then $x \leq_L y$,
2. a descending sequence if for all $x, y \in S$, if $x \leq y$ then $y \leq_L x$.


In the parlance of order theory, an ascending sequence is in particular a suborder of
order type (i.e., isomorphic to) $\omega$, while a descending sequence is a suborder of order
type $\omega^*$ (the reverse of the natural ordering on $\omega$). One point of caution, however, is
that these are not equivalent. A suborder of type $\omega$, for example, is simply a subset
$P^*$ of P such that every $x \in P^*$ has only finitely many $\leq_P$-predecessors in $P^*$. But
there is no reason the ordering needs to also respect the natural ordering. We will
return to this somewhat subtle distinction at the end of this chapter.


Definition 9.2.3 (Ascending/descending sequence principle). ADS is the following 
statement: every infinite linear order has an infinite ascending sequence or descending 
sequence. 


The computability theoretic content of partial and linear orders has been studied 
extensively since at least the 1970s and 1980s. In reverse mathematics specifically, it 
began with the principles CAC and ADS. These were formulated by Hirschfeldt and 
Shore [152] in their seminal paper on combinatorial principles weaker than Ramsey’s 
theorem for pairs. Notably, except for Ramsey’s theorem and logical principles like 
induction, bounding, etc., these were two of the very first principles found to lie 
outside the “big five” classification. 


9.2.1 Equivalences and bounds 


We begin with some basic relationships between CAC, ADS, and Ramsey’s theorem.
First, notice that a partial order $(P, \leq_P)$ naturally induces a 2-coloring of
$[P]^2$. Namely, one color can be used for $\leq_P$-comparable pairs, another for
$\leq_P$-incomparable pairs. A homogeneous set is then obviously either a chain or an
antichain. This witnesses that CAC is identity reducible to $\mathsf{RT}^2_2$ and also the following:

Proposition 9.2.4 (Hirschfeldt and Shore [152]). $\mathsf{RCA}_0 \vdash \mathsf{RT}^2_2 \to \mathsf{CAC}$.


A linear order $(L, \leq_L)$ also naturally induces a 2-coloring of $[L]^2$, and this can
be used to show that ADS is a consequence of $\mathsf{RT}^2_2$ much as above. We will return to
this shortly. However, ADS is also a consequence of CAC.


Proposition 9.2.5 (Hirschfeldt and Shore [152]). $\mathsf{RCA}_0 \vdash \mathsf{CAC} \to \mathsf{ADS}$.


Proof. We prove that ADS is identity reducible to CAC. The result is just a formalization
of this argument. Fix a linear order $(L, \leq_L)$. Define a partial ordering $\leq_P$ of
L as follows: for $x \leq y$ in L, set $x \leq_P y$ if $x \leq_L y$, and set $x \mid_P y$ if $y <_L x$. Thus,
$\leq_P$ tests whether x and y are ordered the same way or the opposite way as they are under
the natural ordering. Clearly, $\leq_P$ is computable from $\leq_L$. It is easy to check that $\leq_P$
is indeed a partial order. Reflexivity and antisymmetry are obvious. For transitivity,
if $x \leq_P y$ and $y \leq_P z$ then it must be that $x \leq y \leq z$ and $x \leq_L y \leq_L z$, so also
$x \leq z$ and $x \leq_L z$, and therefore $x \leq_P z$. Now, let $S = \{x_0 < x_1 < \cdots\} \subseteq L$ be an
infinite chain or antichain for $\leq_P$. If S is a chain then we must have $x_0 <_P x_1 <_P \cdots$
and therefore also $x_0 <_L x_1 <_L \cdots$. Thus, S is an ascending sequence for $\leq_L$. If S
is an antichain, then we must have $x_0 >_L x_1 >_L \cdots$, so S is an infinite descending
sequence for $\leq_L$. □
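
As a finite illustration of this reduction, the following Python sketch builds the partial order from a linear order and checks, on a small sample, that chains correspond to ascending sequences and antichains to descending ones. The helper names and the sample order (evens ascending below odds descending, a finite fragment of an order of type ω + ω*) are assumptions chosen for the example, not part of the text.

```python
from itertools import combinations


def induced_partial_order(leq_L):
    """x <=_P y iff x <= y in the natural order and x <=_L y."""
    def leq_P(x, y):
        return x <= y and leq_L(x, y)
    return leq_P


def is_chain(S, leq_P):
    return all(leq_P(x, y) or leq_P(y, x) for x, y in combinations(S, 2))


def is_antichain(S, leq_P):
    return all(not leq_P(x, y) and not leq_P(y, x) for x, y in combinations(S, 2))


# Hypothetical sample linear order on {0, ..., 7}: 0 < 2 < 4 < 6 < 7 < 5 < 3 < 1.
def key(x):
    return x if x % 2 == 0 else 100 - x


def leq_L(x, y):
    return key(x) <= key(y)


leq_P = induced_partial_order(leq_L)
evens = [0, 2, 4, 6]   # ascending for <=_L, hence a chain for <=_P
odds = [1, 3, 5, 7]    # descending for <=_L, hence an antichain for <=_P
```

Here `is_chain(evens, leq_P)` and `is_antichain(odds, leq_P)` both hold, matching the two cases at the end of the proof.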


Testing whether a pair of numbers is or is not ordered the same way as under the
natural ordering turns out to be very useful, and is encountered repeatedly in the
context of CAC and ADS. We could use it, for example, to give an alternative proof of
Proposition 9.2.4. Namely, given a partial order $(P, \leq_P)$, define a 3-coloring $c_{(P,\leq_P)}$
of $[P]^2$ as follows: for $x < y$ in P, let

$c_{(P,\leq_P)}(x, y) = \begin{cases} 0 & \text{if } y <_P x, \\ 1 & \text{if } x <_P y, \\ 2 & \text{if } x \mid_P y. \end{cases}$

An infinite homogeneous set for $c_{(P,\leq_P)}$ is again a CAC-solution to $\leq_P$. And if we
instead start with a linear order $(L, \leq_L)$, then $c_{(L,\leq_L)}$ will be a 2-coloring (the color
2 will never be used), and an infinite homogeneous set for $c_{(L,\leq_L)}$ will then be an
ADS-solution to $\leq_L$.

It turns out that such colorings can be used to completely characterize CAC and 
ADS in terms of restrictions of Ramsey’s theorem for pairs. The key observation is 
that these colorings behave like partial/linear orders, in the following sense. 


Definition 9.2.6. Fix $k \geq 2$ and a coloring $c : [\omega]^2 \to k$.


1. c is semi-transitive if for all but at most one $i < k$, if $c(x, y) = c(y, z) = i$ then
$c(x, z) = i$.
2. c is transitive if for all $i < k$, if $c(x, y) = c(y, z) = i$ then $c(x, z) = i$.


If $(P, \leq_P)$ is a partial order, then the coloring $c_{(P,\leq_P)}$ is easily seen to be
semi-transitive (the only color $i < 3$ that need not satisfy the transitive property is $i = 2$).
So if $(L, \leq_L)$ is a linear order, the coloring $c_{(L,\leq_L)}$ is transitive.
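
The induced 3-coloring and the two transitivity notions can be checked on finite samples with a short Python sketch. The function names and the two sample orders (divisibility on 1..12 and a small linear order) are assumptions made for illustration; a finite check of this kind does not replace the general argument.

```python
from itertools import combinations


def induced_coloring(leq_P):
    """Color a pair x < y by 0, 1, or 2 according to how <=_P relates x and y."""
    def c(x, y):
        if not x < y:
            raise ValueError("c is only defined on pairs x < y")
        if leq_P(y, x):
            return 0
        if leq_P(x, y):
            return 1
        return 2
    return c


def transitive_colors(c, domain, k):
    """Return the colors i < k with: c(x,y) = c(y,z) = i implies c(x,z) = i."""
    good = set(range(k))
    for x, y, z in combinations(sorted(domain), 3):
        i = c(x, y)
        if i == c(y, z) and c(x, z) != i:
            good.discard(i)
    return good


# Sample partial order (an assumption): divisibility on {1, ..., 12}.
c_div = induced_coloring(lambda x, y: y % x == 0)
colors_div = transitive_colors(c_div, range(1, 13), 3)


# Sample linear order (an assumption): 0 < 2 < 4 < 6 < 7 < 5 < 3 < 1.
def key(x):
    return x if x % 2 == 0 else 100 - x


c_lin = induced_coloring(lambda x, y: key(x) <= key(y))
colors_lin = transitive_colors(c_lin, range(8), 3)
```

On the divisibility sample only color 2 fails (for instance on the triple 2, 3, 4), so the coloring is semi-transitive but not transitive; on the linear sample all colors pass.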
Proposition 9.2.7 (Hirschfeldt and Shore [152]). The following are provable in
$\mathsf{RCA}_0$.


1. CAC is equivalent to the restriction of $\mathsf{RT}^2_2$ to semi-transitive colorings.
2. ADS is equivalent to the restriction of $\mathsf{RT}^2_2$ to transitive colorings.


Proof. The proof of (1) is divided into two parts: first, that the restriction of $\mathsf{RT}^2_2$
to semi-transitive colorings is identity reducible to CAC; and second, that CAC
is identity reducible to the restriction of $\mathsf{RT}^2_3$ to semi-transitive colorings. It is
straightforward to formalize these arguments over $\mathsf{RCA}_0$, and equally straightforward
(using basically the same proof as Corollary 4.6.12) to show that for all $k \geq 2$,
the restrictions of $\mathsf{RT}^2_k$ and $\mathsf{RT}^2_2$ to semi-transitive colorings are equivalent over
$\mathsf{RCA}_0$. To prove the reductions, first fix a semi-transitive coloring $c : [\omega]^2 \to 2$. Say
$c(x, y) = c(y, z) = 1 \to c(x, z) = 1$, for all $x < y < z$. Define a c-computable
partial ordering $\leq_P$ of $\omega$ as follows: for $x < y$, set $x <_P y$ if $c(x, y) = 1$. Now every
infinite chain for $\leq_P$ is an infinite homogeneous set for c with color 1, and every
infinite antichain for $\leq_P$ is an infinite homogeneous set for c with color 0. Next, let
$(P, \leq_P)$ be an instance of CAC. The coloring $c_{(P,\leq_P)} : [P]^2 \to 3$ defined above is
computable from $(P, \leq_P)$. And as we noted, it is semi-transitive, and every infinite
homogeneous set for it is an infinite chain or antichain for $\leq_P$.

Now, to prove (2), we prove that each of ADS and the restriction of $\mathsf{RT}^2_2$ to transitive
colorings is identity reducible to the other. Fix a transitive coloring $c : [\omega]^2 \to 2$.
We construct a c-computable linear ordering $\leq_L$ by stages. At stage $s \in \omega$, we define
$\leq_L$ on $\omega \upharpoonright s+1$. Thus, at stage 0, we just define $0 \leq_L 0$. Now take $s > 0$ and assume
that $\leq_L$ has been defined on $\omega \upharpoonright s$. If there is an $x < s$ such that $c(x, s) = 1$, choose
the $\leq_L$-largest such x and define $y \leq_L s$ for all $y \leq_L x$. Define $s \leq_L y$ for all other
$y < s$. If no such x exists, meaning $c(x, s) = 0$ for all $x < s$, then define $s \leq_L x$ for
all $x < s$. In either case, define $s \leq_L s$. This completes the construction.

It is clear from the construction that $\leq_L$ is indeed a linear order. (New elements are always
added to the linear order either above all previously added elements, below all
previously added elements, or between two previously ordered elements.) Consider
any ADS-solution S for $\leq_L$, and suppose first that S is ascending. We claim that S is
homogeneous for c with color 1. If not, fix the least $y \in S$ such that there is a w in S
with $y < w$ (and hence also $y <_L w$) and $c(y, w) = 0$. Let s be the least such w for our
fixed y. For us to have ordered $y <_L s$ at stage s of the construction, there must be an
$x < s$ such that $y \leq_L x$ and $c(x, s) = 1$. Obviously, $y \neq x$. If $x < y$ then for us to have
ordered $y <_L x$ we must have $c(x, y) = 0$. But then since $c(y, s) = 0$, transitivity
of c implies $c(x, s) = 0$, a contradiction. So it must be that $y < x$. If $c(y, x) = 1$
then since $c(x, s) = 1$, transitivity of c implies $c(y, s) = 1$, a contradiction. Thus,
we must have $c(y, x) = 0$. But now x can serve as a smaller witness w above than s,
contradicting the choice of s.

To complete the proof, suppose S is descending. Then it follows readily from
the construction that S is homogeneous for c with color 0. For the converse, let
$(L, \leq_L)$ be an instance of ADS. Then the coloring $c_{(L,\leq_L)} : [L]^2 \to 2$ is computable
from $(L, \leq_L)$ and every infinite homogeneous set for it is an infinite ascending or
descending sequence for $\leq_L$. □
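
The stagewise construction of the linear order from a transitive 2-coloring, as used in part (2), can be sketched in Python on an initial segment of the natural numbers. The function name `build_linear_order`, the list representation of the order, and the sample coloring are assumptions made for this illustration.

```python
def build_linear_order(c, n):
    """Run stages 0, ..., n-1 of the construction (n >= 1 assumed).

    c is a 2-coloring of pairs x < y. The result lists 0, ..., n-1 from the
    least to the greatest element of the constructed linear order.
    """
    order = [0]                          # stage 0: only the element 0
    for s in range(1, n):
        idx = -1
        for pos, x in enumerate(order):
            if c(x, s) == 1:
                idx = pos                # position of the largest x with c(x, s) = 1
        # Insert s immediately above that x, or at the bottom if no such x exists.
        order.insert(idx + 1, s)
    return order


# Sample transitive coloring (an assumption): c(x, y) = 1 iff x precedes y
# in the linear order 0 < 2 < 4 < 5 < 3 < 1.
def key(x):
    return x if x % 2 == 0 else 100 - x


def c(x, y):
    return 1 if key(x) < key(y) else 0


order = build_linear_order(c, 6)
position = {x: p for p, x in enumerate(order)}
```

For this sample the construction recovers the order 0, 2, 4, 5, 3, 1, and for every pair x < y the element x lies below y in the constructed order exactly when c(x, y) = 1.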
A basic question now is how “close” ADS and CAC are to each other, and of
course, to $\mathsf{RT}^2_2$. For starters, these principles are not computably true. In the case
of CAC, Herrmann [142] showed that there is a computable partial ordering of
$\omega$ with no infinite $\Sigma^0_2$ chain or antichain. In the case of ADS, Tennenbaum (see
Rosenstein [260]) and Denisov (see Goncharov and Nurtazin [127]) independently
constructed a computable linear order with no computable suborder of order type $\omega$
or $\omega^*$. In our parlance, this implies there is a computable instance of each of CAC and
ADS with no computable solution. The following provides a stronger lower bound.


Theorem 9.2.8 (Hirschfeldt and Shore [152]). $\mathsf{RCA}_0 \vdash \mathsf{ADS} \to \mathsf{COH}$.


Proof. Arguing in $\mathsf{RCA}_0$, fix $\vec{R} = \langle R_i : i \in \omega \rangle$, an instance of COH. For each $x \in \mathbb{N}$,
let $\sigma_x \in 2^{<\mathbb{N}}$ be the string of length x defined by: $\sigma_x(i) = R_i(x)$ for all $i < x$. (So
the set of $\sigma_x$ exists, as does the map $x \mapsto \sigma_x$.) Define a linear ordering $\leq_L$ of $\mathbb{N}$ as
follows: $x \leq_L y$ if and only if $\sigma_x$ lexicographically precedes $\sigma_y$, i.e., $\sigma_x \preceq \sigma_y$ or

$(\exists i < x)(\forall j < i)[\sigma_x(j) = \sigma_y(j) \wedge \sigma_x(i) < \sigma_y(i)]$.

Clearly, $\leq_L$ exists and as such is an instance of ADS. Fix any ADS-solution, S, to this
instance, and suppose first that it is ascending. We claim that S is $\vec{R}$-cohesive. Fix i
and consider the set

$M = \{\sigma \in 2^{i+1} : (\exists y \in S)[\sigma \text{ lexicographically precedes } \sigma_y]\}$.

This is $\Sigma^0_1$-definable and bounded, and so exists by $\mathsf{I}\Sigma^0_1$. Hence, we can fix the
lexicographically largest $\sigma \in M$, say with witness $y \in S$. Since S is ascending, $\sigma_y$,
and therefore also $\sigma$, must lexicographically precede $\sigma_x$ for all $x \geq y$ in S. Then for all
sufficiently large $x \in S$, we necessarily have $\sigma \preceq \sigma_x$, as otherwise $\sigma_x \upharpoonright i+1$ would be
an element of M (with witness x) lexicographically larger than $\sigma$. Hence, for all such
x, we have by definition that $R_i(x) = \sigma_x(i) = \sigma(i)$, so either $S \subseteq^* R_i$ or $S \subseteq^* \overline{R_i}$.
Since i was arbitrary, we conclude that S is $\vec{R}$-cohesive. This completes the proof in
the case that S is ascending. If S is descending, we interchange “lexicographically
precedes” with “lexicographically succeeds” and “lexicographically largest” with
“lexicographically least”, and then the proof is the same. □
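
The linear order used in this proof is easy to compute on an initial segment. The Python sketch below is an illustration under assumptions: the sample family (R_i taken to be the multiples of i + 2) and the names `R`, `sigma`, and `leq_L` are hypothetical. It relies on the fact that Python compares tuples lexicographically, with a proper prefix counting as smaller, which matches the definition of "lexicographically precedes" above.

```python
def R(i, x):
    """Sample family (an assumption): R_i is the set of multiples of i + 2."""
    return x % (i + 2) == 0


def sigma(x):
    """The string sigma_x of length x with sigma_x(i) = R_i(x)."""
    return tuple(int(R(i, x)) for i in range(x))


def leq_L(x, y):
    # Tuple comparison is lexicographic, and a proper prefix is smaller.
    return sigma(x) <= sigma(y)


# List 0, ..., 19 from least to greatest in the constructed order.
order = sorted(range(20), key=sigma)
```

In the resulting list, 0 comes first (its string is empty), then every odd number, then every positive even number: all x outside R_0 lie below all x inside R_0. The same splitting happens at every later coordinate, which is the finite shadow of why an ascending or descending sequence must eventually settle inside or outside each R_i.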


Note that, in terms of finer reducibilities, what we have actually shown is that COH 
is identity reducible to ADS. 


9.2.2 Stable partial and linear orders 


As with Ramsey’s theorem for pairs, we can try to understand CAC and ADS better
by dividing each into simpler forms, analogously to the Cholak–Jockusch–Slaman
decomposition. We have already remarked that this idea has proved fruitful for many
principles, but here is our first example of how this is done outside of Ramsey’s
theorem proper. The starting point is an analogue of “limit color” in the setting of
partial/linear orders. For a partial order $(P, \leq_P)$, recall the induced coloring $c_{(P,\leq_P)}$
discussed in the previous section. If this coloring is stable, then considering its limit
colors leads to the following definitions.


Definition 9.2.9. Let $(P, \leq_P)$ be an infinite partial order. Then $x \in P$ is:


1. $\leq_P$-small if $x \leq_P y$ for almost all $y \in P$,
2. $\leq_P$-large if $y \leq_P x$ for almost all $y \in P$,
3. $\leq_P$-isolated if $x \mid_P y$ for almost all $y \in P$.


Given $S \subseteq P$, say $x \in S$ is $\leq_P$-small, $\leq_P$-large, or $\leq_P$-isolated in S if $x \leq_P y$,
$y \leq_P x$, or $x \mid_P y$, respectively, for almost all $y \in S$.


When <p is fixed, we may call x as above just small, large, or isolated. 

The obvious idea is to call a partial or linear order stable just in case every element 
is small, large, or isolated (i.e., precisely if the induced coloring above is stable). For 
partial orders, we also consider an alternative, stronger notion of stability. 


Definition 9.2.10 (Stable partial and linear orders). 


1. An infinite partial order is weakly stable if every element is small, large, or 
isolated. 

2. An infinite partial order is stable if either every element is small or isolated, or 
every element is large or isolated. 

3. An infinite linear order is stable if every element is either small or large. 

4. We define the following principles: 


• WSCAC is the restriction of CAC to weakly stable partial orders.
• SCAC is the restriction of CAC to stable partial orders.
• SADS is the restriction of ADS to stable linear orders.


Therefore, a stable linear order is one of order type $\omega + \omega^*$, as defined in Section 6.5.

Unfortunately, the above terminology can be a bit confusing. A stable linear order 
is necessarily weakly stable as a partial order, but it is not necessarily stable as a 
partial order. Aesthetically, it would maybe make more sense to re-name “weakly 
stable” by “stable”, and “stable” by “strongly stable”. But the above terminology 
is well established in the literature, and actually, for most intents and purposes the 
distinction is inconsequential. Most importantly, we have the following theorem due 
to Jockusch, Kastermans, Lempp, Lerman, and Solomon [171]. 


Theorem 9.2.11. $\mathsf{RCA}_0 \vdash \mathsf{SCAC} \leftrightarrow \mathsf{WSCAC}$.


Proof. Obviously, WSCAC → SCAC (every stable partial order is weakly stable). To
show that SCAC → WSCAC, we argue in $\mathsf{RCA}_0$. Let $(P, \leq_P)$ be an infinite weakly
stable partial order. Define a new partial ordering $\leq'_P$ of P as follows: for all x, y in
P, set $x \leq'_P y$ if and only if $x \leq_P y$ and $x \leq y$.

It is easy to check that this is indeed a partial order. Note that if $x \in P$ is $\leq_P$-small
then it is also $\leq'_P$-small. And if x is $\leq_P$-large or $\leq_P$-isolated, then it is $\leq'_P$-isolated.
Thus, by weak stability of $\leq_P$, every $x \in P$ is either $\leq'_P$-small or $\leq'_P$-isolated, so
$(P, \leq'_P)$ is stable. Apply SCAC to find an infinite $S \subseteq P$ which is either a chain or
antichain for $\leq'_P$. In the former case, S is also clearly a chain for $\leq_P$. In the latter
case, S must consist entirely of $\leq_P$-large or $\leq_P$-isolated elements. Hence, $(S, \leq_P)$ is
a stable partial order. Apply SCAC to find an infinite $S^* \subseteq S \subseteq P$ which is either a
chain or antichain for $\leq_P$, and the proof is complete. □
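
The refined ordering from this proof can be examined on a finite sample. In the Python sketch below, the sample partial order (even numbers ordered as usual, odd numbers ordered in reverse, and no even number comparable to an odd one) and all helper names are assumptions chosen for illustration.

```python
from itertools import combinations


def leq_P(x, y):
    """Sample partial order (an assumption): evens ascending, odds descending."""
    if x == y:
        return True
    if x % 2 == 0 and y % 2 == 0:
        return x < y
    if x % 2 == 1 and y % 2 == 1:
        return x > y
    return False


def leq_P_refined(x, y):
    """The refined order: keep a comparison only if it agrees with the natural order."""
    return leq_P(x, y) and x <= y


def is_partial_order(leq, domain):
    dom = list(domain)
    reflexive = all(leq(x, x) for x in dom)
    antisymmetric = all(not (leq(x, y) and leq(y, x)) or x == y
                        for x in dom for y in dom)
    transitive = all(not (leq(x, y) and leq(y, z)) or leq(x, z)
                     for x in dom for y in dom for z in dom)
    return reflexive and antisymmetric and transitive


def is_chain(S, leq):
    return all(leq(x, y) or leq(y, x) for x, y in combinations(S, 2))


def is_antichain(S, leq):
    return all(not leq(x, y) and not leq(y, x) for x, y in combinations(S, 2))


evens = [0, 2, 4, 6, 8]
odds = [1, 3, 5, 7, 9]
```

In this sample the odd numbers form an antichain for the refined order but a chain for the original order, which is the situation in which the proof applies SCAC a second time.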


For now, then, we will stick to SCAC, as it tends to be slightly easier to work with. 
But we will revisit WSCAC, and its relationship to SCAC, at the end of this chapter, 
when we consider finer reducibilities. 


Proposition 9.2.12 (Hirschfeldt and Shore [152]). The following are provable in
$\mathsf{RCA}_0$.


1. SCAC is equivalent to the restriction of $\mathsf{SRT}^2_2$ to semi-transitive colorings.
2. SADS is equivalent to the restriction of $\mathsf{SRT}^2_2$ to transitive colorings.
3. $\mathsf{RCA}_0 \vdash \mathsf{SRT}^2_2 \to \mathsf{SCAC} \to \mathsf{SADS} \to \mathsf{B}\Sigma^0_2$.


Proof. For (1) and (2), the proof is identical to that of Proposition 9.2.7. All that is
needed is to observe that if $(P, \leq_P)$ is a stable partial order then the induced coloring
$c_{(P,\leq_P)}$ is stable. Our (stronger) notion of stability also implies that $c_{(P,\leq_P)}$ is a
2-coloring (in fact, it uses only the colors 0 and 2 or 1 and 2). Part (3) is obvious except
for the last part, that SADS → $\mathsf{B}\Sigma^0_2$. This proceeds by showing that SADS implies
the principle PART from Definition 6.5.7; the details are left to the exercises. □


Like $\mathsf{SRT}^2_2$, each of SCAC and SADS admits $\Delta^0_2$ solutions. This follows from parts
(1) and (2) above, and noting that the equivalences there are actually computable
reductions (in fact, identity reductions). But it can also be seen directly. First, note
that given a stable partial or linear order, determining whether an element of the
domain is small, large, or (if the order is partial) isolated, is computable in the jump
of the order. This is just like determining an element’s limit color under a stable
coloring of pairs (see Remark 8.4.3).

Hence, the jump can compute an analogue of an infinite limit homogeneous set, 
i.e., an infinite set of elements, all of which are small, or all of which are large, or all 
of which are isolated. Just like in Proposition 8.4.2, this limit homogeneous set can 
then be computably thinned to an actual solution for the order—a chain or antichain 
in the case of SCAC, and an ascending or descending sequence in the case of SADS. 
See Exercise 9.13.2 to work this out in detail. 
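
For linear orders, the thinning step described above amounts to a greedy search, which the following Python sketch illustrates. Everything named here is an assumption for the example: the generator `thin_to_ascending`, the ranking function `rank`, and the sample order of type ω + ω* in which the even numbers form the ω-part (with neighbouring pairs swapped) and the odd numbers form the ω*-part.

```python
from itertools import count, islice


def rank(x):
    """Sample stable linear order (an assumption): evens below odds;
    evens in the order 2, 0, 6, 4, 10, 8, ...; odds in decreasing order."""
    if x % 2 == 0:
        return (0, (x // 2) ^ 1)
    return (1, -x)


def less_L(x, y):
    return rank(x) < rank(y)


def thin_to_ascending(small_elements, less_L):
    """Greedily thin an increasing enumeration of small elements.

    Each element kept is above the previous one in both the natural order
    and the linear order, so the output is an ascending sequence.
    """
    last = None
    for x in small_elements:
        if last is None or less_L(last, x):
            last = x
            yield x


evens = (2 * k for k in count())   # the small elements, in increasing order
first_four = list(islice(thin_to_ascending(evens, less_L), 4))
```

Because every kept element is small, all but finitely many later elements lie above it, so the search for the next element always succeeds. For this sample the first four elements produced are 0, 4, 8, 12.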

To obtain decompositions, we now need to formulate suitable analogues of COH. 
It is not necessarily obvious how to do this, since COH is not prima facie any kind 
of restriction of Ramsey’s theorem. However, we can formulate these analogues in a 
canonical way (if not particularly interesting, from a combinatorial perspective), as in
the equivalence given by Exercise 8.10.7. We can simply let the “cohesive version” 
be the statement that each instance is stable on some infinite subdomain. As it turns 
out, this is not a new principle in the case of CAC. 
Proposition 9.2.13 (Hirschfeldt and Shore [152]). Over $\mathsf{RCA}_0$, ADS is equivalent
to the statement that for every infinite partial order $(P, \leq_P)$ there is an infinite set
$S \subseteq P$ such that $(S, \leq_P)$ is stable.


Proof. We argue in $\mathsf{RCA}_0$. Let an infinite partial order $(P, \leq_P)$ be given. By
Exercise 9.13.1, there exists a linear order $\leq_L$ of P extending $\leq_P$ (so $x \leq_P y \to x \leq_L y$,
for all $x, y \in P$). By ADS, we may fix an infinite ascending or descending sequence
$S \subseteq P$ for $\leq_L$. Now define a family $\vec{R} = \langle R_x : x \in \omega \rangle$ of subsets of S by setting, for
each $x \in S$,

$R_x = \{y \in S : x < y \wedge x \mid_P y\}$.

Since ADS implies COH, we may invoke Exercise 8.10.5 to find an infinite
$\vec{R}$-cohesive set $X \subseteq S$. Let $\leq_X$ be the restriction of $\leq_P$ to X. Then $(X, \leq_X)$ is stable.
Indeed, consider any $x \in X$. If $X \subseteq^* R_x$ then $x \mid_P y$ for almost all $y \in X$. Hence,
x is $\leq_X$-isolated. On the other hand, suppose $X \subseteq^* \overline{R_x}$, which means $x \leq_P y$ or
$y \leq_P x$ for almost all $y \in X$. If S is ascending then $x \leq_L y$ for all $y \geq x$ in S, so since $\leq_L$
extends $\leq_P$, we must have $x \leq_P y$ for almost all $y \in X$. Thus, x is $\leq_X$-small. And
if S is descending, then we analogously get that x is $\leq_X$-large. Since this depends
only on S and not on x, we conclude that either every element of X is $\leq_X$-isolated
or $\leq_X$-small, or every element of X is $\leq_X$-isolated or $\leq_X$-large. □


For ADS, we formulate a separate principle. 


Definition 9.2.14. CADS is the following statement: for every infinite linear order
$(L, \leq_L)$ there is an infinite set $S \subseteq L$ such that $(S, \leq_L)$ is stable.


The following proposition summarizes the basic relationships between the “cohesive
versions”. Of note is that by parts (3) and (4), CADS and COH are equivalent modulo
$\mathsf{B}\Sigma^0_2$ (and hence, each is also equivalent in the same way to the principle CRT
defined in Exercise 8.10.7).


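Proposition 9.2.15 (Hirschfeldt and Shore [152]).


1. $\mathsf{RCA}_0 \vdash \mathsf{CAC} \leftrightarrow \mathsf{SCAC} + \mathsf{ADS}$.
2. $\mathsf{RCA}_0 \vdash \mathsf{ADS} \leftrightarrow \mathsf{SADS} + \mathsf{CADS}$.
3. $\mathsf{RCA}_0 \vdash \mathsf{ADS} \to \mathsf{COH} \to \mathsf{CADS}$.
4. $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2 \vdash \mathsf{CADS} \to \mathsf{COH}$.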


Proof. Parts (1) and (2) are immediate by Proposition 9.2.13 and the definitions. The
first implication in part (3) is Theorem 9.2.8. The proof of the second implication is
similar to that of Proposition 9.2.13.

Finally, for (4), we proceed as in the proof of Theorem 9.2.8. Given an instance
$\vec{R} = \langle R_i : i \in \omega \rangle$, define a linear order $\leq_L$ as there. Applying CADS (instead of ADS,
this time), we get an infinite set S such that $\leq_L$ is stable on S. If there are only finitely
many $\leq_L$-small or finitely many $\leq_L$-large elements in S, then using $\mathsf{B}\Sigma^0_2$ we can thin
S to an infinite ascending or descending sequence for $\leq_L$. (See Exercise 9.13.2.) We
can then argue as in Theorem 9.2.8 that this is cohesive for $\vec{R}$.
So assume there are infinitely many $\leq_L$-small and $\leq_L$-large elements in S. By
$\mathsf{B}\Sigma^0_2$, S has a $\leq_L$-least element, $0_L$, as well as a $\leq_L$-largest element, $1_L$. Now fix i,
and let 

$M = \{\sigma \in 2^{i+1} : (\exists y \in S)[\sigma = \sigma_y \upharpoonright i+1]\}$.


(Recall that $\sigma_y$ is the string of length y such that $\sigma_y(i) = R_i(y)$ for all $i < y$.) The set
M exists by bounded $\Sigma^0_1$ comprehension. Say M has b many elements. So $b \leq 2^{i+1}$,
but also $b > 0$: since S is infinite, we may take any $y > i$ in S, and then $\sigma = \sigma_y \upharpoonright i+1$
belongs to M. For each $\sigma \in M$, let $y_\sigma$ be the least $y \in S$ such that $\sigma = \sigma_y \upharpoonright i+1$,
which also exists by $\mathsf{I}\Sigma^0_1$. Let $y_0 <_L \cdots <_L y_{b-1}$ list all elements of S of the form $y_\sigma$
for $\sigma \in M$. For completeness, let $y_{-1} = 0_L$ and $y_b = 1_L$. (Note that it could be that
$y_{-1} = y_0$ or $y_b = y_{b-1}$.)

We next define a coloring $c : \mathbb{N} \to b + 1$ as follows: given $x \in \mathbb{N}$, let c(x) be the
least $j < b + 1$ such that $x \leq_L y_j$. Since $x \leq_L y_b$, the existence of c(x) follows by
$\mathsf{I}\Sigma^0_1$, so c is well-defined. By $\mathsf{RT}^1$ (which follows from $\mathsf{B}\Sigma^0_2$) we may fix $j < b + 1$
such that $c(x) = j$ for infinitely many $x \in \mathbb{N}$. By definition of c, all such x satisfy
$y_{j-1} <_L x \leq_L y_j$. Hence, $y_{j-1}$ is $\leq_L$-small in S and $y_j$ is $\leq_L$-large in S, and we
actually have that $c(x) = j$ for almost all $x \in S$. In particular, almost all $x \in S$ satisfy
$x > i$ and $y_{j-1} <_L x \leq_L y_j$.

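We claim that for any such x, writing $\sigma = \sigma_x \upharpoonright i+1$, either $\sigma_{y_{j-1}} \succeq \sigma$ or $\sigma \preceq \sigma_{y_j}$. Indeed, by definition
of $\leq_L$, $\sigma_{y_{j-1}}$ lexicographically precedes $\sigma_x$, which lexicographically precedes $\sigma_{y_j}$.
And since $x > i$, $\sigma = \sigma_x \upharpoonright i+1 \in M$. So if $\sigma_{y_{j-1}} \nsucceq \sigma$ and $\sigma \npreceq \sigma_{y_j}$ then we
would have that $y_{j-1} <_L y_\sigma <_L y_j$. But this is impossible by how the numbering
$y_0, \ldots, y_b$ was chosen. So the claim holds.

Now by the preceding claim, either $\sigma_{y_{j-1}} \succeq \sigma_x \upharpoonright i+1$ for almost all $x \in S$, or $\sigma_x \upharpoonright i+1 \preceq \sigma_{y_j}$
for almost all $x \in S$. In the former case, we have that $R_i(x) = \sigma_x(i) = \sigma_{y_{j-1}}(i)$ for
almost all $x \in S$, and in the latter we have that $R_i(x) = \sigma_x(i) = \sigma_{y_j}(i)$ for almost all
$x \in S$. Thus, either $S \subseteq^* R_i$ or $S \subseteq^* \overline{R_i}$. Since i was arbitrary, we conclude that S is
$\vec{R}$-cohesive. □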


9.2.3 Separations over $\mathsf{RCA}_0$


Having established implications between CAC, ADS, and their stable and cohesive 
variants, let us turn to nonimplications. To begin, we have the following. 


Proposition 9.2.16 (Hirschfeldt and Shore [152]). $\mathsf{RCA}_0 \nvdash \mathsf{COH} \to \mathsf{SADS}$.


Proof. On the one hand, SADS → $\mathsf{B}\Sigma^0_2$ by Proposition 9.2.12. On the other, COH
is $\Pi^1_1$ conservative over $\mathsf{RCA}_0$ by Theorem 7.7.1 and therefore does not imply $\mathsf{B}\Sigma^0_2$.
Hence, COH does not imply SADS over $\mathsf{RCA}_0$. □


In fact, Wang [322] has also established the stronger fact that $\mathsf{SADS} \nleq_\omega \mathsf{COH}$. The
proof, which we omit, uses a preservation property based on definability (see
Definition 9.5.5). We will see a somewhat similar idea used in the proof of Theorem 9.2.22
below.

Another pair of principles we can separate immediately is SADS and WKL. Of
course, SADS does not imply WKL over $\mathsf{RCA}_0$ since even $\mathsf{RT}^2_2$ does not. The failure
of the converse follows from the fact that SADS implies $\mathsf{B}\Sigma^0_2$ while, by Harrington’s
theorem (Theorem 7.7.3), WKL is $\Pi^1_1$ conservative over $\mathsf{RCA}_0$. We can improve this
with an $\omega$-model separation.


Proposition 9.2.17 (Csima and Mileti [59]). SADS omits hyperimmune free solutions.


Proof. The computable linear order of Tennenbaum and Denisov mentioned above, which has
no computable suborder of order type $\omega$ or $\omega^*$, is actually of
order type $\omega + \omega^*$. It is easy to computably thin the domain of this order to produce a
computable instance $(L, \leq_L)$ of SADS with no computable solution. We claim that,
in fact, every solution to $(L, \leq_L)$ is hyperimmune. Suppose not. Let $S \subseteq L$ be a
solution that is not hyperimmune, and let f be a computable function dominating $p_S$. Then for all x we have
$p_S(x) \leq f(x) \leq p_S(f(x))$, so in particular, f(x) is $\leq_L$-small or $\leq_L$-large depending
on whether S is ascending or descending. But then, using f, we can find an infinite computable
set either of $\leq_L$-small or $\leq_L$-large elements, and this can then be computably thinned
using Exercise 9.13.2 to a computable solution to $(L, \leq_L)$. □


Corollary 9.2.18. $\mathsf{SADS} \nleq_\omega \mathsf{WKL}$. Therefore, WKL and SADS are incomparable
over $\mathsf{RCA}_0$.


Proof. By Exercise 4.8.13, there is an $\omega$-model of WKL consisting entirely of sets
that have hyperimmune free degree. By the preceding proposition, SADS is false in
this model. □


Although each of $\mathsf{SRT}^2_2$, SCAC, and SADS admits $\Delta^0_2$ solutions (as noted above),
a surprising point of difference is that SCAC and SADS actually also admit low
solutions. Recall that, by Theorem 8.8.4, this is not the case for $\mathsf{SRT}^2_2$.


Theorem 9.2.19 (Hirschfeldt and Shore [152]). SCAC admits low solutions.


Proof. We prove the result for computable stable partial orders. The full result
follows by relativization. So suppose $(P, \leq_P)$ is computable and stable, say with
every element small or isolated. (The case where every element is large or isolated
is symmetric.) If there is a computable infinite antichain for $\leq_P$ then we are done,
so suppose otherwise. Call a string $\alpha \in \omega^{<\omega}$ a precondition if the following hold.

• For all $i < j < |\alpha|$, $\alpha(i) < \alpha(j)$.
• For all $i < j < |\alpha|$, $\alpha(i) <_P \alpha(j)$.

Now, consider the notion of forcing whose conditions are strings $\alpha \in P^{<\omega}$ such that
$\alpha$ is a precondition and also $\alpha(|\alpha| - 1)$ is $\leq_P$-small (and hence so is $\alpha(i)$ for all
$i < |\alpha|$). Extension is as usual for strings, and the valuation of a condition $\alpha$ is simply
its range as a subset of $\omega$. (Note that this is a $\emptyset'$-computable forcing.) A sufficiently
generic set $G$ is thus a chain for $\leq_P$ (in fact, an ascending sequence). Clearly, for
each $n$, the set of conditions $\alpha$ with $|\alpha| > n$ is uniformly $(P, \leq_P)$-effectively dense,
as otherwise almost every element of $P$ would be $\leq_P$-isolated, which would then
contradict our assumption by Exercise 9.13.2. We next argue that for each $e \in \omega$,
the set of conditions that decide whether $e \in G'$ is uniformly $\emptyset'$-effectively dense.
To see this, fix $e$ and a condition $\alpha$. We search for a condition $\alpha^* \succeq \alpha$ such that one
of the following holds.

1. $\Phi_e^{\alpha^*}(e) \downarrow$.
2. $\Phi_e^{\beta}(e) \uparrow$ for every precondition $\beta \succeq \alpha^*$.

Since the set of preconditions (as opposed to conditions) is computable, the search
can be performed using $\emptyset'$. We claim it must succeed. Assume not. For every
condition $\alpha^* \succeq \alpha$ there is a precondition $\beta \succeq \alpha^*$ such that $\Phi_e^{\beta}(e) \downarrow$, else (2) would
hold. Moreover, no such $\beta$ can be a condition, else (1) would hold, so $\beta(|\beta| - 1)$
must be $\leq_P$-isolated. Let $S^*$ be the set of all numbers of the form $\beta(|\beta| - 1)$ for $\beta$ a
precondition extending $\alpha$ with $\Phi_e^{\beta}(e) \downarrow$. Then $S^*$ is an infinite c.e. set of $\leq_P$-isolated
elements, and hence has an infinite computable subset $S$. Now by Exercise 9.13.2,
there is an infinite computable antichain for $\leq_P$ contained in $S$, which contradicts
our assumption. So the claim holds, and we must find the desired $\alpha^*$. If (1) holds,
then $\alpha^* \Vdash e \in G'$. If (2) holds, then $\alpha^* \Vdash e \notin G'$, as every condition is a precondition.
This proves that the set of conditions that decide whether $e \in G'$ is uniformly
$\emptyset'$-effectively dense. As usual, we can now take a generic $G$ according to Theorem 7.5.6
and conclude that $G' \leq_T \emptyset'$. □
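The notion of precondition used in the proof above is decidable from the order alone, which is what makes the $\emptyset'$-search possible. The following minimal Python sketch illustrates that check on finite strings. The function name `is_precondition` and the example order `same_parity_less` are hypothetical and chosen only for illustration; whether a precondition is an actual condition (its last element being small) is not something this sketch attempts to decide.

```python
from typing import Callable, Sequence


def is_precondition(alpha: Sequence[int],
                    less_P: Callable[[int, int], bool]) -> bool:
    """Return True if alpha is strictly increasing both in the natural
    order and in the strict partial order less_P."""
    return all(
        alpha[i] < alpha[j] and less_P(alpha[i], alpha[j])
        for i in range(len(alpha))
        for j in range(i + 1, len(alpha))
    )


# Hypothetical example order (an assumption, not from the text):
# x <_P y iff x < y and x, y have the same parity.
def same_parity_less(x: int, y: int) -> bool:
    return x < y and x % 2 == y % 2
```

For instance, `[0, 2, 6]` is a precondition for this example order, while `[0, 3]` is not, since 0 and 3 are incomparable.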


Corollary 9.2.20. $\mathsf{SRT}^2_2 \nleq_\omega \mathsf{SCAC}$. Hence also $\mathsf{RCA}_0 \nvdash \mathsf{SCAC} \to \mathsf{SRT}^2_2$.


Next, recall the problem DNR, which we defined in Definition 4.3.9 and saw again 
in the proof of Theorem 8.8.2. 


Theorem 9.2.21 (Hirschfeldt and Shore [152]). $\mathsf{DNR} \nleq_\omega \mathsf{CAC}$. Hence also
$\mathsf{RCA}_0 \nvdash \mathsf{CAC} \to \mathsf{DNR}$.


Proof. Let $\mathcal{C}$ be the class of all sets $Y$ such that $Y$ computes no DNC function.
Clearly, $\mathcal{C}$ is closed downward under $\leq_T$. We show that CAC admits preservation of
$\mathcal{C}$. Of course, by definition, DNR does not admit preservation of $\mathcal{C}$ (with witness $\emptyset$).
The result then follows by Theorem 4.6.13.

We begin by showing that SCAC admits preservation of $\mathcal{C}$. That is, given a set
$A$ computing no DNC function, and an $A$-computable instance of SCAC, i.e., an
$A$-computable infinite stable partial order $(P, \leq_P)$, we exhibit an infinite chain or
antichain for $(P, \leq_P)$ whose join with $A$ computes no DNC function. As usual, we
take $A = \emptyset$ for ease of notation, with the full result following by relativization. First,
if $\leq_P$ has an infinite computable antichain, then we are done, so assume not. In
particular, there are infinitely many $\leq_P$-small elements in $P$. Say every element of
$P$ is either $\leq_P$-small or $\leq_P$-isolated, the other case being symmetric. Consider the
notion of forcing from Theorem 9.2.19, as well as the notion of precondition defined
there. We claim that for every Turing functional $\Gamma$, the set of conditions $\alpha$ forcing,
for some $e$, either that $\Gamma^G(e) \uparrow$ or that $\Gamma^G(e) \downarrow = \Phi_e(e) \downarrow$, is dense. Any sufficiently
generic $G$ such that $\Gamma^G$ is total will consequently satisfy that there is an $e$ such that
$\Gamma^G(e) \downarrow = \Phi_e(e) \downarrow$, so $\Gamma^G$ will not be a DNC function.

To prove the claim, fix a Turing functional $\Gamma$ and a condition $\alpha$. If there is an $e$ and
a condition $\alpha^* \succeq \alpha$ with $\Gamma^{\alpha^*}(e) \downarrow = \Phi_e(e) \downarrow$, then $\alpha^*$ forces $\Gamma^G(e) \downarrow = \Phi_e(e) \downarrow$. If,
instead, there is an $e$ and a condition $\alpha^* \succeq \alpha$ such that $\Gamma^{\beta}(e) \uparrow$ for every precondition
$\beta \succeq \alpha^*$, then $\alpha^*$ forces $\Gamma^G(e) \uparrow$ (since every condition is a precondition). So assume
neither possibility holds; we derive a contradiction. By assumption, for every $e$ there
is a precondition $\beta \succeq \alpha$ such that $\Gamma^{\beta}(e) \downarrow$. Moreover, if $\Phi_e(e) \downarrow$ and $\Gamma^{\beta}(e) = \Phi_e(e)$,
then $\beta$ cannot be a condition, so $\beta(|\beta| - 1)$ must be $\leq_P$-isolated.

Now, if there were only finitely many $e$ such that $\Gamma^{\beta}(e) \downarrow = \Phi_e(e) \downarrow$ for some
precondition $\beta \succeq \alpha$, then we could compute a function $f$ with $f(e) \neq \Phi_e(e)$ for
almost all $e$, as follows: given $e$, search for the least precondition $\beta \succeq \alpha$ such that
$\Gamma^{\beta}(e) \downarrow$ and set $f(e) = \Gamma^{\beta}(e)$. Since the set of preconditions is computable, so would
be $f$. But then a finite modification of $f$ would be a computable DNC function, which
is impossible. So there must be infinitely many $e$ such that $\Gamma^{\beta}(e) \downarrow = \Phi_e(e) \downarrow$ for
some precondition $\beta \succeq \alpha$. And by our use conventions, we must have $|\beta| > e$ if
$\Gamma^{\beta}(e) \downarrow$. It follows that there is an infinite computable set $S \subseteq P$ of numbers of the
form $\beta(|\beta| - 1)$ for some precondition $\beta \succeq \alpha$ with $\Gamma^{\beta}(e) \downarrow = \Phi_e(e) \downarrow$. But then $S$
is an infinite computable set of $\leq_P$-isolated elements of $P$, which can be thinned
to an infinite computable antichain for $\leq_P$ by Exercise 9.13.2. This contradicts our
assumption that $\leq_P$ has no such antichain. This completes the proof that SCAC
admits preservation of $\mathcal{C}$.

Now, since $\mathsf{SADS} \leq_\omega \mathsf{SCAC}$, it follows that SADS admits preservation of $\mathcal{C}$.
We also recall, from the proof of Theorem 8.8.2, that COH admits preservation
of $\mathcal{C}$. Hence, by Exercise 4.8.12, SADS + COH admits preservation of $\mathcal{C}$. Since
$\mathsf{SADS} + \mathsf{COH} \equiv_\omega \mathsf{ADS}$ by Proposition 9.2.15, it follows that ADS admits preservation
of $\mathcal{C}$, and hence so does SCAC + ADS. And since $\mathsf{CAC} \equiv_\omega \mathsf{SCAC} + \mathsf{ADS}$, also by
Proposition 9.2.15, we conclude at last that CAC admits preservation of $\mathcal{C}$. □


The final separation we consider in this section is between CAC and ADS. This 
was left open by Hirschfeldt and Shore [152], and formed a major question in reverse 
mathematics for a number of years. The eventual resolution actually established the 
following stronger fact. 


Theorem 9.2.22 (Lerman, Solomon, and Towsner [195]). $\mathsf{SCAC} \nleq_\omega \mathsf{ADS}$. Hence
also $\mathsf{RCA}_0 \nvdash \mathsf{ADS} \to \mathsf{SCAC}$.


The proof in [195] is a two-part forcing argument. A ground forcing is used to create 
a suitably complicated instance of SCAC. Then, an iterated forcing is used to add 
solutions to instances of ADS that do not compute solutions to this SCAC instance. 
(A nice introduction to this technique for more general applications is given by 
Patey [245].) Here, we give a somewhat simplified proof due to Patey [243] which 
uses preservation properties in the sense of Definition 3.6.11. We will need the 
following combinatorial notion. 


Definition 9.2.23 (Patey [243]).

1. A formula $\varphi(X, Y)$ of sets is essential if for every $n \in \omega$ there is a finite $E > n$
such that for every $m \in \omega$, $\varphi(E, F)$ holds for some finite $F > m$.

2. Fix $A, P_0, P_1 \in 2^\omega$. Then $P_0, P_1$ are dependently $A$-hyperimmune if for every
$\Sigma^0_1(A)$ essential formula $\varphi(X, Y)$, $\varphi(F_0, F_1)$ holds for some finite sets $F_0 \subseteq P_0$
and $F_1 \subseteq P_1$.

3. A problem P admits preservation of dependent hyperimmunity if for every
$A \in 2^\omega$, and all dependently $A$-hyperimmune sets $P_0, P_1$, every $A$-computable
instance $X$ of P has a solution $Y$ such that $P_0, P_1$ are dependently $(A \oplus Y)$-hyperimmune.


Clearly, if $P_0, P_1$ are dependently $A$-hyperimmune, they are dependently
$A^*$-hyperimmune for all $A^* \leq_T A$. So the class $\mathcal{C}_{P_0,P_1}$ of all $A$ such that $P_0, P_1$ are
dependently $A$-hyperimmune is closed downwards under $\leq_T$, and a problem admits
preservation of dependent hyperimmunity if and only if, for all $P_0, P_1 \in 2^\omega$, it
admits preservation of $\mathcal{C}_{P_0,P_1}$.

The two main facts from which the theorem follows are the following. 


Lemma 9.2.24 (Patey [243]). ADS admits preservation of dependent hyperimmu- 
nity. 


Lemma 9.2.25 (Patey [243]). SCAC does not admit preservation of dependent 
hyperimmunity. 


Proof (of Theorem 9.2.22; Patey [243]). By Lemma 9.2.25, fix $P_0, P_1 \in 2^\omega$ such that SCAC does not
admit preservation of $\mathcal{C}_{P_0,P_1}$. Since ADS admits preservation of dependent
hyperimmunity by Lemma 9.2.24, it admits preservation of $\mathcal{C}_{P_0,P_1}$. From here, the conclusion follows by
Theorem 4.6.13. □


Let us now prove Lemmas 9.2.24 and 9.2.25, beginning with the former. 


Proof (of Lemma 9.2.24). As usual, we prove the result in the unrelativized setting
for simplicity. Let $(L, \leq_L)$ be a computable linear order, and let $P_0, P_1$ be a pair of
dependently $\emptyset$-hyperimmune sets. We must show that there is an infinite ascending
or descending sequence $G$ for $\leq_L$ such that $P_0, P_1$ are dependently $G$-hyperimmune.
First, suppose there is an infinite set $I$, whose elements are either all $\leq_L$-small in
$I$ or all $\leq_L$-large in $I$, and such that $P_0, P_1$ are dependently $I$-hyperimmune. By
Exercise 9.13.2, $I$ can be computably thinned to a solution $S$ for $(L, \leq_L)$, and $P_0, P_1$
are dependently $S$-hyperimmune since $S \leq_T I$. In this case, we can thus take $G = S$.

Assume next that no $I$ as above exists. We force with 2-fold Mathias conditions
$(E_0, E_1, I)$ satisfying the following additional properties.

• $E_0$ is $\leq_L$-ascending and $x \leq_L y$ for all $x \in E_0$ and $y \in I$.
• $E_1$ is $\leq_L$-descending and $x \geq_L y$ for all $x \in E_1$ and $y \in I$.
• $x \leq_L y$ for all $x \in E_0$ and all $y \in E_1$.
• $P_0, P_1$ are dependently $I$-hyperimmune.

(Trivially, $(\emptyset, \emptyset, \omega)$ is a condition.) A sufficiently generic filter yields objects $G_0$
and $G_1$, where $G_0$ is a $\leq_L$-ascending sequence and $G_1$ is a $\leq_L$-descending sequence
(and $x \leq_L y$ for all $x \in G_0$ and all $y \in G_1$). It is easy to see, from our assumption,
that both $G_0$ and $G_1$ are infinite sets. We claim that for some $i < 2$, $P_0, P_1$ are
dependently $G_i$-hyperimmune, so we can then take $G = G_i$ to complete the proof.

Let $\mathsf{G}_0$ and $\mathsf{G}_1$ be names for $G_0$ and $G_1$ in the forcing language. The way we prove
the claim is to show, for every pair $\varphi_0(\mathsf{G}_0, X, Y)$ and $\varphi_1(\mathsf{G}_1, X, Y)$ of $\Sigma^0_1$ formulas of
$\mathcal{L}_2(\mathsf{G})$, that the set of conditions forcing the following is dense: for some $i < 2$, if
$\varphi_i(\mathsf{G}_i, X, Y)$ is essential, then

$$(\exists \text{ finite } F_0 \subseteq P_0)(\exists \text{ finite } F_1 \subseteq P_1)[\varphi_i(\mathsf{G}_i, F_0, F_1)]. \qquad (9.1)$$

The claim then follows by genericity and Lachlan's disjunction.

So fix $\varphi_0(\mathsf{G}_0, X, Y)$ and $\varphi_1(\mathsf{G}_1, X, Y)$ as above and let $(E_0, E_1, I)$ be any condition.
By passing to an extension if necessary, we can assume $(E_0, E_1, I)$ forces that both
$\varphi_0(\mathsf{G}_0, X, Y)$ and $\varphi_1(\mathsf{G}_1, X, Y)$ are essential (otherwise we are done). Let $\psi(X, Y)$ be
the formula asserting there are finite sets $D_0, D_1 \subseteq I$ such that the following hold.

• $D_0$ is $\leq_L$-ascending.
• $D_1$ is $\leq_L$-descending.
• $x \leq_L y$ for all $x \in D_0$ and all $y \in D_1$.
• For each $i < 2$, there exist finite sets $F_{i,0} \subseteq X$ and $F_{i,1} \subseteq Y$ such that
$\varphi_i(E_i \cup D_i, F_{i,0}, F_{i,1})$ holds.


Then $\psi$ is $\Sigma^0_1(I)$ and we claim it is essential. For each $n \in \omega$ we exhibit a finite set $F_0 > n$
such that for each $m \in \omega$ there is a finite set $F_1 > m$ for which $\psi(F_0, F_1)$ holds. Fix $n$. Take
a generic filter containing $(E_0, E_1, I)$, and let $G_0$ and $G_1$ be the objects determined
by it. By genericity, $\varphi_0(G_0, X, Y)$ and $\varphi_1(G_1, X, Y)$ are essential. Hence, we can fix
finite sets $F_{0,0}, F_{1,0} > n$ such that for every $m$, there exist finite sets $F_{0,1}, F_{1,1} > m$
for which $\varphi_0(G_0, F_{0,0}, F_{0,1})$ and $\varphi_1(G_1, F_{1,0}, F_{1,1})$ hold. Let $F_0 = F_{0,0} \cup F_{1,0}$. Now
fix $m \in \omega$. By choice of $F_{0,0}$ and $F_{1,0}$, we can fix finite sets $F_{0,1}, F_{1,1} > m$ such
that $\varphi_0(G_0, F_{0,0}, F_{0,1})$ and $\varphi_1(G_1, F_{1,0}, F_{1,1})$ hold. Let $F_1 = F_{0,1} \cup F_{1,1}$. Now since
$\varphi_0$ and $\varphi_1$ are existential formulas, there is a $k \in \omega$ such that $\varphi_0(G_0 \upharpoonright k, F_{0,0}, F_{0,1})$
and $\varphi_1(G_1 \upharpoonright k, F_{1,0}, F_{1,1})$ hold. Without loss of generality, $k > E_0, E_1$. Hence, for
each $i < 2$, $G_i \upharpoonright k = E_i \cup D_i$ for some $D_i \subseteq I$. Since $G_0$ is $\leq_L$-ascending, so is $D_0$.
Similarly, $D_1$ is $\leq_L$-descending. And since $x \leq_L y$ for all $x \in G_0$ and all $y \in G_1$,
the same is true of all $x \in D_0$ and all $y \in D_1$. Thus, $\psi(F_0, F_1)$ holds, as claimed.

Since $P_0, P_1$ are dependently $I$-hyperimmune, it follows that there exist $F_0 \subseteq P_0$
and $F_1 \subseteq P_1$ for which $\psi(F_0, F_1)$ holds. Fix the witnessing sets $D_0, D_1$ and, for each
$i < 2$, $F_{i,0}$ and $F_{i,1}$. Then $F_{i,0} \subseteq P_0$ and $F_{i,1} \subseteq P_1$. If there are infinitely many $y \in I$
such that $x \leq_L y$ for all $x \in D_0$ (or equivalently, for the $\leq_L$-largest element of $D_0$),
then set $E_0^* = E_0 \cup D_0$, $E_1^* = E_1$, and $I^* = \{y \in I : y > D_0 \wedge (\forall x \in D_0)[x \leq_L y]\}$.
Otherwise, since $x \leq_L y$ for all $x \in D_0$ and all $y \in D_1$, it follows that there are
infinitely many $y \in I$ such that $x \geq_L y$ for all $x \in D_1$. In this case, set $E_0^* = E_0$,
$E_1^* = E_1 \cup D_1$ and $I^* = \{y \in I : y > D_1 \wedge (\forall x \in D_1)[x \geq_L y]\}$. In either case,
$(E_0^*, E_1^*, I^*)$ is an extension of $(E_0, E_1, I)$ forcing (9.1). □


Proof (of Lemma 9.2.25). Recall by Proposition 9.2.12 that SCAC is $\omega$-model equivalent
to the restriction of $\mathsf{SRT}^2_2$ to semi-transitive colorings. We show that the
latter problem does not admit preservation of dependent hyperimmunity. To
this end, we show that there exists a computable stable semi-transitive coloring
$c : [\omega]^2 \to 2$ such that the sets $P_0 = \{x \in \omega : \lim_y c(x, y) = 0\}$ and
$P_1 = \{x \in \omega : \lim_y c(x, y) = 1\}$ are dependently $\emptyset$-hyperimmune. To see that
this suffices, suppose $H$ is any infinite homogeneous set for $c$. Let $\varphi(X, Y)$ be the
$\Sigma^0_1(H)$ formula $X \neq \emptyset \wedge Y \neq \emptyset \wedge X \subseteq H \wedge Y \subseteq H$. Since $H$ is infinite, $\varphi$ is easily
seen to be essential. But clearly, there do not exist $F_0 \subseteq P_0$ and $F_1 \subseteq P_1$ such that
$\varphi(F_0, F_1)$ holds, since either $H \cap P_0 = \emptyset$ or $H \cap P_1 = \emptyset$. Hence, $P_0, P_1$ are not
dependently $H$-hyperimmune.

The construction is a finite injury argument. Fix an effective enumeration
$\varphi_0, \varphi_1, \ldots$ of all $\Sigma^0_1$ formulas. For each $e \in \omega$, we aim to satisfy the following
requirement:

$$\mathcal{R}_e : \varphi_e \text{ is essential} \to (\exists F_0 \subseteq P_0)(\exists F_1 \subseteq P_1)[\varphi_e(F_0, F_1)].$$

We actually construct the sets $P_0$ and $P_1$ first, and then define $c$ appropriately after.
We proceed by stages. At stage $s$, we define approximations $P_{0,s}$ and $P_{1,s}$ to $P_0$
and $P_1$, respectively, with $P_{0,s} \cup P_{1,s} = \omega \upharpoonright s$. For each $e$, we also define a number
$m_{e,s} \leq s$.


Construction. To begin, set $P_{0,0} = P_{1,0} = \emptyset$ and $m_{e,0} = 0$ for all $e$. Declare all
requirements active. Next, fix $s \in \omega$ and suppose $P_{0,s}$ and $P_{1,s}$ have been defined,
along with $m_{e,s}$ for each $e$.

Choose the least $e < s$, if it exists, such that $\mathcal{R}_e$ is active and there exist finite
sets $F_0 < F_1$ contained in $[m_{e,s}, s]$ for which $\varphi_e(F_0, F_1)$ holds (with witness bounded
by $s$). In this case, say that $\mathcal{R}_e$ acts at stage $s + 1$, and declare it inactive. Let

$$P_{0,s+1} = (P_{0,s} \cap [0, m_{e,s})) \cup [m_{e,s}, \max F_0]$$

and

$$P_{1,s+1} = (P_{1,s} \cap [0, m_{e,s})) \cup (\max F_0, s].$$

Since $P_{0,s} \cup P_{1,s} = \omega \upharpoonright s$ by assumption, we have that $P_{0,s+1} \cup P_{1,s+1} = \omega \upharpoonright s + 1$.
For $e^* \leq e$ set $m_{e^*,s+1} = m_{e^*,s}$. For $e^* > e$, set $m_{e^*,s+1} = s + 1$, and declare $\mathcal{R}_{e^*}$ active.

If no requirement acts at stage $s + 1$, simply set $P_{0,s+1} = P_{0,s} \cup \{s\}$ and $P_{1,s+1} =
P_{1,s}$, and $m_{e,s+1} = m_{e,s}$ for all $e$. This completes the construction.
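The two set updates in the construction can be made concrete with a small Python sketch of a single stage. This is only an illustration under simplifying assumptions: the search for the sets $F_0 < F_1$ and the bookkeeping of active requirements are left out, and the function names `act` and `idle` are hypothetical.

```python
def act(P0: set, P1: set, m: int, F0: set, s: int):
    """One acting step at stage s + 1.

    m is the restraint m_{e,s} of the acting requirement and F0 is the
    finite set found for it, assumed to satisfy m <= min(F0), max(F0) <= s.
    Everything below m is kept; [m, max F0] goes to P0 and (max F0, s]
    goes to P1.
    """
    top = max(F0)
    new_P0 = {x for x in P0 if x < m} | set(range(m, top + 1))
    new_P1 = {x for x in P1 if x < m} | set(range(top + 1, s + 1))
    return new_P0, new_P1


def idle(P0: set, P1: set, s: int):
    """Stage s + 1 when no requirement acts: s joins P0, P1 is unchanged."""
    return P0 | {s}, set(P1)
```

For example, starting from `P0 = {0, 1, 2}` and `P1 = set()` at stage `s = 3`, calling `act(P0, P1, 1, {1}, 3)` returns `({0, 1}, {2, 3})`, a partition of $\omega \upharpoonright 4$.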


Verification. It is easy to verify, by induction on $e$, that every requirement acts at
most finitely many stages. Indeed, for $\mathcal{R}_e$ to act at some stage, it must be active at the
beginning of the stage, and at this point it is declared inactive. Then the only reason it
can act again is because some $\mathcal{R}_{e^*}$ for $e^* < e$ acts at a later stage, at which point $\mathcal{R}_e$
is again declared active. By the same token, every requirement is active or inactive
at cofinitely many stages. Notice, too, that if a requirement is inactive at cofinitely
many stages then it is satisfied.

Now fix any $n \in \omega$. We claim that for each $i < 2$, $P_{i,s} \upharpoonright n$ is the same for all
sufficiently large $s$. This is trivially true if no requirement ever acts at any stage
after $n$. Otherwise, choose any stage $s_0 > n$ at which some requirement, say $\mathcal{R}_{e_0}$,
acts. Let $s_1 > s_0$ be such that no requirement $\mathcal{R}_e$ with $e < e_0$ acts again at any
stage $s > s_1$. Then for any such $s$, the only requirements $\mathcal{R}_e$ that can act satisfy
$m_{e,s} \geq m_{e,s_1} \geq n$, and hence $P_{i,s} \upharpoonright n = P_{i,s_1} \upharpoonright n$. This proves the claim. Now for
each $i < 2$, set $P_i = \{x \in \omega : (\forall^\infty s)[x \in P_{i,s}]\}$. By construction, every
$x \in \omega$ belongs to exactly one of $P_0$ or $P_1$.

We next claim that each $\mathcal{R}_e$ is satisfied. We proceed by induction on $e$. Assume
all $\mathcal{R}_{e^*}$ for $e^* < e$ are satisfied, and let $s_0$ be a stage such that no such $\mathcal{R}_{e^*}$ acts at
stage $s_0$ or after. Seeking a contradiction, suppose $\mathcal{R}_e$ is not satisfied. Then $\mathcal{R}_e$ must
be active at every stage $s \geq s_0$. Also, $\varphi_e$ must be essential, else $\mathcal{R}_e$ would be satisfied
trivially. So, there exist finite sets $F_0, F_1$ such that $m_{e,s_0} \leq F_0 < F_1$ and $\varphi_e(F_0, F_1)$
holds. Choose any $s > \max(s_0, \max F_1)$ and large enough to bound the existential
quantifier in $\varphi_e(F_0, F_1)$. By construction, $m_{e,s} = m_{e,s_0}$, so $F_0$ and $F_1$ are contained
in $[m_{e,s}, s)$. But then $\mathcal{R}_e$ acts at stage $s$, becoming inactive, a contradiction.


To complete the proof, we define a computable stable semi-transitive coloring $c$
so that for each $i < 2$, $P_i = \{x \in \omega : \lim_y c(x, y) = i\}$. Given $x < s$, let $c(x, s) = i$
for the unique $i$ such that $x \in P_{i,s}$. Since the construction is computable, so is $c$.
Since, for each $x$ and each $i < 2$, $P_{i,s} \upharpoonright x + 1$ is the same for all sufficiently large $s$, it
follows that $\lim_s c(x, s)$ exists. Hence, $c$ is stable, and $P_0$ and $P_1$ are the desired sets.

It remains only to show that $c$ is semi-transitive. Fix $x < y < z$ such that
$c(x, y) = c(y, z) = 0$. We claim that $c(x, z) = 0$. Seeking a contradiction, suppose
not, so that $c(x, z) = 1$. Then $x \in P_{0,y}$ and $x \in P_{1,z}$, so we can fix the largest
$s < z$ so that $x \in P_{0,s}$. In particular, $y \leq s$ and $x \in P_{1,s+1}$. By construction,
$P_{1,s+1} = (P_{1,s} \cap [0, m_{e,s})) \cup (\max F_0, s]$ for some finite set $F_0$ found during the
action of some strategy $\mathcal{R}_e$ at stage $s + 1$ of the construction. We must thus have that
$m_{e,s} \leq x$, and since $x < y \leq s$, it follows that also $y \in P_{1,s+1}$. But $y \in P_{0,z}$, so we
can fix the largest stage $s^*$ with $s + 1 \leq s^* < z$ such that $y \in P_{1,s^*}$. By construction,
$P_{0,s^*+1} = (P_{0,s^*} \cap [0, m_{e^*,s^*})) \cup [m_{e^*,s^*}, \max F_0^*]$ for some finite set $F_0^*$ found during
the action of some strategy $\mathcal{R}_{e^*}$ at stage $s^* + 1$ of the construction. If $e^* > e$ then
$m_{e^*,s^*} \geq m_{e^*,s+1} = s + 1 > y$, so $y$, being an element of $P_{1,s^*}$, could not be an element
of $P_{0,s^*+1}$. Thus, $e^* < e$, and hence $m_{e^*,s^*} \leq m_{e,s^*} = m_{e,s} \leq x < y \leq \max F_0^*$. It
follows that $x \in P_{0,s^*+1}$, which contradicts the choice of $s$ since $s^* + 1 > s$. □


Of the implications proved above, the only one remaining unaddressed is $\mathsf{COH} \to
\mathsf{CADS}$ in Proposition 9.2.15, and this is because whether or not CADS implies COH
remains open.


Question 9.2.26. Is it the case that $\mathsf{RCA}_0 \vdash \mathsf{CADS} \to \mathsf{COH}$?


But recall that CADS does imply COH over $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2$ (Proposition 9.2.15 (4)),
so a negative answer to the question would necessarily need to involve nonstandard
models. We summarize everything in Figure 9.1.


9.2.4 Variants under finer reducibilities 


We conclude this section with a brief discussion of some related results, mostly
omitting proofs. The main focus here is on technical variations of the principles CAC
and ADS.


[Diagram with nodes $\mathsf{ACA}_0$; $\mathsf{SRT}^2_2 + \mathsf{COH} \leftrightarrow \mathsf{RT}^2_2$; $\mathsf{WKL}_0$; $\mathsf{SCAC} + \mathsf{ADS} \leftrightarrow \mathsf{CAC}$; $\mathsf{SRT}^2_2$; $\mathsf{SADS} + \mathsf{CADS} \leftrightarrow \mathsf{ADS}$; $\mathsf{WSCAC} \leftrightarrow \mathsf{SCAC}$; DNR; COH; SADS; CADS; $\mathsf{B}\Sigma^0_2$; $\mathsf{RCA}_0$.]

Figure 9.1. The location of CAC, ADS, and their stable and cohesive variants alongside $\mathsf{RT}^2_2$.
Arrows denote implications over $\mathsf{RCA}_0$; double arrows are implications that cannot be reversed.
Except for a potential arrow from CADS to COH, no additional arrows can be added.


A quick inspection of the proofs shows that most of the implications illustrated in
Figure 9.1 are actually formalizations in $\mathsf{RCA}_0$ of finer reducibilities. Indeed, almost
all of them are Weihrauch reductions. Astor, Dzhafarov, Solomon, and Suggs [6]
observed that, under $\leq_{\mathrm{W}}$, some of the principles we have considered can break apart.
More precisely, they noticed that the definition of ascending or descending sequence
in the formulation of ADS can be computably weakened, as follows.


Definition 9.2.27 (Astor, Dzhafarov, Solomon, and Suggs [6]).

1. Let $(L, \leq_L)$ be an infinite partial order. A subset $S$ of $L$ is

• an ascending chain if every $x \in S$ is $\leq_L$-small in $S$,
• a descending chain if every $x \in S$ is $\leq_L$-large in $S$.

2. ADC is the following statement: every infinite linear order has an infinite
ascending chain or descending chain.

3. SADC is the following statement: every infinite stable linear order has an infinite
ascending chain or descending chain.


Ascending/descending chains are analogues of limit homogeneous sets: either all
elements are small, or all elements are large. (Thus, SADS is to $\mathsf{SRT}^2_2$ as SADC
is to $\mathsf{D}^2_2$.) By Exercise 9.13.2, every infinite ascending or descending chain can
be computably thinned to an ascending or descending sequence (in the sense of
Definition 9.2.2). In particular, we have that $\mathsf{ADC} \equiv_{\mathrm{c}} \mathsf{ADS}$ and $\mathsf{SADC} \equiv_{\mathrm{c}} \mathsf{SADS}$. By
contrast, we have the following.
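The thinning step mentioned above can be sketched as a greedy search: walk through the chain in the natural order and keep an element whenever it lies above the last element kept. The Python sketch below works on a finite window only, and assumes (as for an ascending chain) that every element is below all but finitely many others, so that in the infinite setting the search for the next element always succeeds. The names `thin_to_ascending` and `swapped_pairs_less`, and the example order itself, are hypothetical illustrations rather than the content of Exercise 9.13.2.

```python
from typing import Callable, Iterable, List


def thin_to_ascending(chain: Iterable[int],
                      less_L: Callable[[int, int], bool],
                      length: int) -> List[int]:
    """Greedily select x_0 < x_1 < ... (natural order) from chain
    with x_0 <_L x_1 <_L ..., stopping after `length` elements."""
    seq: List[int] = []
    for x in sorted(chain):
        # Keep x only if it is L-above the most recently kept element.
        if not seq or less_L(seq[-1], x):
            seq.append(x)
            if len(seq) == length:
                break
    return seq


# Hypothetical linear order of type omega: 1, 0, 3, 2, 5, 4, ...
def swapped_pairs_less(x: int, y: int) -> bool:
    return (x ^ 1) < (y ^ 1)
```

On the window `range(10)` with this example order, asking for four elements yields `[0, 2, 4, 6]`, which is increasing both as natural numbers and in the example order.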


Theorem 9.2.28 (Astor, Dzhafarov, Solomon, and Suggs [6]). The following
hold: $\mathsf{SADS} \nleq_{\mathrm{W}} \mathsf{ADC}$ and $\mathsf{SADS} \nleq_{\mathrm{W}} \mathsf{D}^2_2$.


Combining this with Theorem 9.1.18, that $\mathsf{SRT}^2_2 \nleq_{\mathrm{W}} \mathsf{D}^2_2$, we see that under Weihrauch
reducibility, no relationships hold among the problems $\mathsf{RT}^2_2$, ADS, ADC, $\mathsf{SRT}^2_2$, $\mathsf{D}^2_2$,
SADS, and SADC except the "obvious ones". (Here, we are thinking of ADS and
SADS as the restrictions of $\mathsf{RT}^2_2$ and $\mathsf{SRT}^2_2$, respectively, to transitive colorings. The
equivalences between these formulations in Propositions 9.2.7 and 9.2.12 are also
Weihrauch reductions.) See Figure 9.2.

One implication we have seen above that is not, on its face, a finer reduction is 
the one from SCAC to WSCAC in Theorem 9.2.11. In the other direction, of course, 
we clearly have a finer reduction, since SCAC is just a subproblem of WSCAC. But 
to show that SCAC implies WSCAC we needed to use SCAC twice. And indeed, this 
is unavoidable, as the following shows. 


Theorem 9.2.29 (Astor, Dzhafarov, Solomon, and Suggs [6]). The following
holds: $\mathsf{WSCAC} \nleq_{\mathrm{c}} \mathsf{SCAC}$.


As pointed out in [6], it may be tempting to ascribe this nonreduction to the fact 
that WSCAC allows for three “limit behaviors” (small, large, and isolated) while 
SCAC allows only for two (small and isolated, or large and isolated). In this way, we 
may be reminded of Theorem 9.1.11, that RT; Lo BL. But curiously, WSCAC is 
computably reducible (in fact, Weihrauch reducible) to SRT. (See Exercise 9.13.3.) 
So, cardinality cannot be all that is going on. And indeed, while the original proof 
of Theorem 9.2.28 was a direct and somewhat blunt forcing argument, a subsequent 
alternative proof due to Patey [241] helped better elucidate some of the computabil- 
ity theoretic underpinnings of the separation. The idea of Patey’s argument uses 
preservation properties similar to those used in the proof of Theorem 9.2.22. 

With partial orders, too, one can consider some variations detectable under finer 
reducibilities. One very interesting example is the following. 


Definition 9.2.30 (Hughes [163]; Towsner [311]).

1. Let $(P, \leq_P)$ be a partial order. A set $S \subseteq P$ is $\omega$-ordered if $x \leq_P y$ implies
$x \leq y$, for all $x, y \in S$.

2. $\mathsf{CAC}^{\mathrm{ord}}$ is the following statement: every infinite $\omega$-ordered partial order has an
infinite chain or antichain.

3. $\mathsf{SCAC}^{\mathrm{ord}}$ is the following statement: every infinite $\omega$-ordered stable partial order
has an infinite $\omega$-ordered chain or antichain.


[Diagram with nodes $\mathsf{SRT}^2_2$, CAC, WSCAC, ADS, $\mathsf{D}^2_2$, SCAC, ADC, SADS, SADC.]

Figure 9.2. Relationships between CAC, ADS, ADC, and their stable forms under Weihrauch
reducibility. All arrows (reductions) are actually strong Weihrauch reductions, and no additional
arrows can be added.


The motivation for looking at $\omega$-ordered partial orders is that, in virtually all
applications (such as when a partial order is defined as part of a proof involving
CAC), it ends up being $\omega$-ordered. In this respect, partial orders that are not
$\omega$-ordered show up less often, and as such are a bit less natural. So we might view
the comparison of $\mathsf{CAC}^{\mathrm{ord}}$ to CAC as a comparison of "natural uses" of CAC to "all
possible uses in principle".

The following result may be interpreted as saying that, under provability over
$\mathsf{RCA}_0$, the "natural cases" are all there is.


Proposition 9.2.31 (Hughes [163]; Towsner [311]). $\mathsf{RCA}_0$ proves $\mathsf{CAC} \leftrightarrow \mathsf{CAC}^{\mathrm{ord}}$.


By contrast, under computable reducibility there is a detectable difference.


Theorem 9.2.32 (Hughes [163]). The problem $\mathsf{CAC}^{\mathrm{ord}}$ admits $\Delta^0_2$ solutions. Hence,
$\mathsf{CAC} \nleq_{\mathrm{c}} \mathsf{CAC}^{\mathrm{ord}}$.


Proof. We prove the result in the unrelativized case. Fix an infinite computable
$\omega$-ordered partial order $(P, \leq_P)$. If this has an infinite computable chain, we are
done. So suppose otherwise. We build a $\emptyset'$-computable sequence of nonempty finite
chains $F_0 < F_1 < \cdots$ for $\leq_P$ such that for every $i$ and every $j < i$, the $\leq_P$-largest
element of $F_j$ is $\leq_P$-incomparable with every element of $F_i$. In particular, the set

$$\{x \in P : (\exists i)[x \text{ is } \leq_P\text{-largest in } F_i]\}$$

will be a $\emptyset'$-computable antichain for $\leq_P$.

Fix $i$ and assume we have defined $F_j$ for all $j < i$. Furthermore, assume inductively
that for each $j < i$, the $\leq_P$-largest element $x$ of $F_j$ satisfies that there is no $y \in P$
with $x <_P y$. Since $\leq_P$ is $\omega$-ordered, this implies that $x$ is isolated in $P$. Fix
$z > \max \bigcup_{j<i} F_j$ such that for all $j < i$, if $x$ is the $\leq_P$-largest element of $F_j$ then
$x \mid_P y$ for all $y > z$ in $P$. This can be found uniformly $\emptyset'$-computably. We define $F_i$
as follows. Let $x_0$ be the least $y > z$ in $P$ and put this into $F_i$. Now suppose we have
defined $x_s \in F_i$ for some $s \in \omega$. If there is a $y \in P$ such that $x_s <_P y$, let $x_{s+1}$ be the
least such $y$ and put this into $F_i$. Otherwise, stop.

Clearly, $F_i$ is a chain for $\leq_P$. And our assumption that $\leq_P$ has no infinite
computable chain implies that $F_i$ is finite. By construction, if $x$ is the $\leq_P$-largest element
of $F_i$ then there is no $y \in P$ with $x <_P y$, so the inductive hypotheses are maintained.
This completes the construction.

To complete the proof, recall the aforementioned result of Herrmann [142], that
CAC omits $\Delta^0_2$ solutions. Hence, $\mathsf{CAC} \nleq_{\mathrm{c}} \mathsf{CAC}^{\mathrm{ord}}$, as desired. □


Surprisingly, the above does not go through for stable partial orders.


Theorem 9.2.33 (Hughes [163]). $\mathsf{SCAC} \equiv_{\mathrm{c}} \mathsf{SCAC}^{\mathrm{ord}}$.


Proof. Fix an infinite stable partial order $(P, \leq_P)$. For simplicity, assume it is computable.
The general case follows by relativization. Also, assume that every element of $P$ is
either $\leq_P$-small or $\leq_P$-isolated. The case where every element is either $\leq_P$-large
or $\leq_P$-isolated is symmetric. Now define a computable partial ordering $\leq_Q$ of $P$ as
follows: for $x, y$ in $P$, set $x \leq_Q y$ if and only if $x \leq y$ and $x \leq_P y$. Thus, $(P, \leq_Q)$ is
$\omega$-ordered. Let $S \subseteq P$ be an $\mathsf{SCAC}^{\mathrm{ord}}$-solution to $(P, \leq_Q)$. If $S$ is a chain for $\leq_Q$, then
it is also clearly a chain for $\leq_P$, and thus serves as an SCAC-solution to $(P, \leq_P)$. So
suppose instead that $S$ is an antichain for $\leq_Q$. By definition of $\leq_Q$, if $x$ is any element
of $S$ then we must have $x \nleq_P y$ for all $y > x$ in $S$. Since $\leq_P$ is stable, this means
every $x \in S$ is $\leq_P$-isolated. Using Exercise 9.13.2, we can consequently computably
thin $S$ out to an infinite antichain for $\leq_P$. □
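The passage from $\leq_P$ to the $\omega$-ordered order $\leq_Q$ in this proof is a one-line transformation, sketched below in Python. The helper names `make_leq_Q` and `is_omega_ordered` are hypothetical, and the example order `leq_P` (evens ascending, odds descending, evens and odds incomparable) is an assumption chosen only to show an order that is not $\omega$-ordered; it is not claimed to be stable.

```python
from typing import Callable, Iterable

Order = Callable[[int, int], bool]


def make_leq_Q(leq_P: Order) -> Order:
    """x <=_Q y iff x <= y (natural order) and x <=_P y."""
    def leq_Q(x: int, y: int) -> bool:
        return x <= y and leq_P(x, y)
    return leq_Q


def is_omega_ordered(elements: Iterable[int], leq: Order) -> bool:
    """Check on a finite set that leq(x, y) implies x <= y."""
    elems = list(elements)
    return all(x <= y for x in elems for y in elems if leq(x, y))


# Hypothetical example order, for illustration only.
def leq_P(x: int, y: int) -> bool:
    if x % 2 != y % 2:
        return False
    return x <= y if x % 2 == 0 else x >= y


leq_Q = make_leq_Q(leq_P)
```

On the window `range(10)`, `leq_P` fails the check (for instance 3 lies below 1), while `leq_Q` passes it; the odd numbers, which form a descending chain under `leq_P`, become an antichain under `leq_Q`.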


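However, the principles SCAC and $\mathsf{SCAC}^{\mathrm{ord}}$ can be separated by finer reducibilities.


Theorem 9.2.34 (Hughes [163]).

1. $\mathsf{SCAC} \nleq_{\mathrm{W}} \mathsf{SCAC}^{\mathrm{ord}}$.
2. $\mathsf{SCAC} \nleq_{\mathrm{sc}} \mathsf{SCAC}^{\mathrm{ord}}$.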


The proofs of both results are forcing arguments. Part 2 employs a combinatorial 
elaboration on the tree labeling method, which we saw used in the proof of Theo- 
rem 9.1.1. 




9.3 Polarized Ramsey’s theorem 


In what follows, it is helpful to recall Definition 3.2.4, and the convention following
it. Namely, for a set $X$, $[X]^n$ denotes the collection of all finite subsets of $X$ of size
$n$, not just increasing $n$-tuples of elements of $X$, even though we usually treat the two
as interchangeable. Here, the distinction is important.


Definition 9.3.1. Fix $n, k \geq 1$ and $c : [\omega]^n \to k$. A tuple $(H_0, \ldots, H_{n-1})$ of subsets
of $\omega$ is:

1. p-homogeneous for $c$ if $c$ is constant on all finite sets $F = \{x_0, \ldots, x_{n-1}\}$ with
$x_0 \in H_0, \ldots, x_{n-1} \in H_{n-1}$ and $|F| = n$,
2. increasing p-homogeneous for $c$ if $c$ is constant on all finite sets $F =
\{x_0, \ldots, x_{n-1}\}$ with $x_0 \in H_0, \ldots, x_{n-1} \in H_{n-1}$ and $x_0 < \cdots < x_{n-1}$.

A p-homogeneous or increasing p-homogeneous set $(H_0, \ldots, H_{n-1})$ is infinite if
each of $H_0, \ldots, H_{n-1}$ is infinite.


Alternatively, if we wanted to stick with just considering colorings defined on
increasing tuples, then $(H_0, \ldots, H_{n-1})$ would be p-homogeneous if $c$ were constant
on all Cartesian products of $H_0, \ldots, H_{n-1}$, in any order. And $(H_0, \ldots, H_{n-1})$ would
be increasing p-homogeneous just if $c$ were constant on $H_0 \times \cdots \times H_{n-1}$.

We now have the following “polarized” analogue of Definition 3.2.5. 


Definition 9.3.2 (Polarized and increasing polarized RT).

1. For $n, k \geq 1$, $\mathsf{PT}^n_k$ is the following statement: every $c : [\omega]^n \to k$ has an infinite
p-homogeneous set.

2. For $n, k \geq 1$, $\mathsf{IPT}^n_k$ is the following statement: every $c : [\omega]^n \to k$ has an infinite
increasing p-homogeneous set.

3. For $n \geq 1$, $\mathsf{PT}^n$ and $\mathsf{IPT}^n$ are $(\forall k)\mathsf{PT}^n_k$ and $(\forall k)\mathsf{IPT}^n_k$, respectively.

4. PT and IPT are $(\forall n)\mathsf{PT}^n$ and $(\forall n)\mathsf{IPT}^n$, respectively.


As with Ramsey's theorem, $\mathsf{RT}^n_1$ is trivially provable in $\mathsf{RCA}_0$, and for all $k \geq 2$, $\mathsf{RT}^n_k$
is equivalent over $\mathsf{RCA}_0$ to $\mathsf{RT}^n_2$, so we usually just stick to the latter when discussing
implications and equivalences.

Various kinds of polarized partition results have been studied extensively in
combinatorics (see, e.g., Chvátal [46]; Erdős and Hajnal [95], and Erdős, Hajnal,
and Milner [96, 97]). The specific principles PT and IPT, inspired by these results,
were first formulated by Dzhafarov and Hirst [83] specifically in the context of
reverse mathematics.

Note that the parts of a p-homogeneous set need not be disjoint. Thus, if we have
an infinite homogeneous set $H$ for a given coloring of $[\omega]^n$, then setting $H_0 = \cdots =
H_{n-1} = H$ yields an infinite p-homogeneous set $(H_0, \ldots, H_{n-1})$. Therefore, we have
$\mathsf{PT}^n_k \leq_{\mathrm{sc}} \mathsf{RT}^n_k$. Also, every p-homogeneous set is clearly increasing p-homogeneous,
so $\mathsf{IPT}^n_k$ is a subproblem of $\mathsf{PT}^n_k$. By the same token, $\mathsf{RCA}_0 \vdash \mathsf{RT}^n_k \to \mathsf{PT}^n_k \to \mathsf{IPT}^n_k$.
(And all the same is true if we quantify out the colors and/or the exponents.) For
$n \geq 3$, these implications reverse.


Proposition 9.3.3 (Dzhafarov and Hirst [83]). Fix $n \geq 3$.

1. $\mathsf{RCA}_0 \vdash \mathsf{RT}^n_2 \leftrightarrow \mathsf{PT}^n_2 \leftrightarrow \mathsf{IPT}^n_2$.
2. $\mathsf{RCA}_0 \vdash \mathsf{RT}^n \leftrightarrow \mathsf{PT}^n \leftrightarrow \mathsf{IPT}^n$.
3. $\mathsf{RCA}_0 \vdash \mathsf{RT} \leftrightarrow \mathsf{PT} \leftrightarrow \mathsf{IPT}$.


The reversals consist in showing that $\mathsf{IPT}^3_2 \to \mathsf{ACA}_0$ and $\mathsf{IPT} \to \mathsf{ACA}_0'$ and appealing
to the equivalences of these with $\mathsf{RT}^3_2$ and RT (Theorem 8.1.6 and Corollary 8.2.6),
respectively.

More interestingly, we also have an equivalence in the case of $n = 2$, at least
between Ramsey's theorem and the polarized Ramsey's theorem. This is somewhat
surprising, as the solutions for these two principles are otherwise quite different.
Consider the following example from [83]: for numbers $x < y$, let $c(x, y)$ be 0 or 1
depending on whether $x$ and $y$ have different or like parity. Then $(H_0, H_1)$, where $H_0$ is the set
of even numbers and $H_1$ the set of odds, is an infinite p-homogeneous set for $c$, with
color 0. But no infinite set can be homogeneous for $c$ with color 0. This highlights
part of the combinatorial difference between the two principles. As we will see, their
equivalence over $\mathsf{RCA}_0$ is rather indirect.
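The parity example can be checked mechanically on a finite window. The following Python sketch (with the window size `N` and all function names chosen here purely for illustration) confirms that every pair taken across evens and odds has color 0, and that already no three-element set is homogeneous with color 0, since two of any three numbers share a parity.

```python
from itertools import combinations

N = 20  # size of the finite window; an arbitrary illustrative choice


def c(x: int, y: int) -> int:
    """Color of the pair {x, y}: 0 for different parity, 1 for like parity."""
    return 0 if (x - y) % 2 else 1


H0 = [x for x in range(N) if x % 2 == 0]  # evens
H1 = [x for x in range(N) if x % 2 == 1]  # odds

# Colors seen on pairs {x, y} with x in H0 and y in H1.
p_colors = {c(x, y) for x in H0 for y in H1}

# Every 3-element set contains a pair of like parity, hence a pair of color 1.
no_hom_0_triple = all(
    any(c(x, y) == 1 for x, y in combinations(T, 2))
    for T in combinations(range(N), 3)
)
```

Here `p_colors` comes out as `{0}` and `no_hom_0_triple` as `True`, matching the two claims in the example.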


Theorem 9.3.4 (Dzhafarov and Hirst [83]). $\mathsf{RCA}_0 \vdash \mathsf{RT}^2_2 \leftrightarrow \mathsf{PT}^2_2$.


Proof. As noted, $\mathsf{RT}^2_2 \to \mathsf{PT}^2_2$. For the converse, we argue in $\mathsf{RCA}_0$ and show
that $\mathsf{PT}^2_2$ implies each of $\mathsf{D}^2_2$ and ADS. Since $\mathsf{ADS} \to \mathsf{COH}$ by Theorem 9.2.8, it
follows that $\mathsf{PT}^2_2 \to \mathsf{RT}^2_2$ by Theorem 8.5.1 (and the equivalence of $\mathsf{D}^2_2$ with $\mathsf{SRT}^2_2$,
Corollary 8.4.11).

First, fix a stable coloring $c : [\mathbb{N}]^2 \to 2$. Regarding this as an instance of $\mathsf{PT}^2_2$, let
$(H_0, H_1)$ be an infinite p-homogeneous set for $c$, say with color $i < 2$. We claim that
$H_0$ is limit homogeneous for $c$ with color $i$. Indeed, fix $x \in H_0$. Then for all $y \in H_1$
different from $x$ we have that $c(x, y) = i$. But as $H_1$ is infinite, it must be that $i$ is the
limit color of $x$.

Now let $\leq_L$ be a linear ordering of $\mathbb{N}$. Let $c : [\mathbb{N}]^2 \to 2$ be the induced coloring
of pairs: that is, for all $x < y$,

$$c(x, y) = \begin{cases} 0 & \text{if } y <_L x, \\ 1 & \text{if } x <_L y. \end{cases}$$

Let $(H_0, H_1)$ be an infinite p-homogeneous set for $c$, say with color $i < 2$. By
primitive recursion, define a sequence $x_0 < x_1 < \cdots$ as follows: $x_0 = \min H_0$; given
$x_{2j}$ for some $j \geq 0$, let $x_{2j+1}$ be the least $x > x_{2j}$ in $H_1$; given $x_{2j+1}$, let $x_{2j+2}$ be
the least $x > x_{2j+1}$ in $H_0$. By definition, $c(x_k, x_{k+1}) = i$ for all $k$. Hence, if $i = 1$ then
$x_0 <_L x_1 <_L \cdots$, and if $i = 0$ then $x_0 >_L x_1 >_L \cdots$. Thus, $\{x_0, x_1, \ldots\}$ is either an
ascending or descending sequence for $\leq_L$. □
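The second half of this proof is effectively an algorithm: alternate between the two sides of the p-homogeneous pair, always taking the least larger element. A minimal Python sketch of that recursion on finite approximations follows. The function names and the sample data (the reverse ordering of the naturals, with multiples of 3 and numbers congruent to 1 mod 3 as the two sides) are assumptions made for illustration; with finite lists the search raises `StopIteration` once a side is exhausted.

```python
from typing import Callable, List, Sequence


def induced_coloring(less_L: Callable[[int, int], bool]) -> Callable[[int, int], int]:
    """For x < y: color 1 if x <_L y, color 0 if y <_L x."""
    def c(x: int, y: int) -> int:
        return 1 if less_L(x, y) else 0
    return c


def alternating_sequence(H0: Sequence[int], H1: Sequence[int], length: int) -> List[int]:
    """x_0 = min H0, then alternately the least larger element of H1, H0, H1, ..."""
    sides = (sorted(H0), sorted(H1))
    seq = [sides[0][0]]
    while len(seq) < length:
        side = len(seq) % 2  # odd positions come from H1, even ones from H0
        seq.append(next(x for x in sides[side] if x > seq[-1]))
    return seq


# Hypothetical instance: the reverse ordering of the natural numbers.
def reverse_less(x: int, y: int) -> bool:
    return x > y


c = induced_coloring(reverse_less)
H0 = [x for x in range(30) if x % 3 == 0]
H1 = [x for x in range(30) if x % 3 == 1]
seq = alternating_sequence(H0, H1, 6)
```

In this instance every pair has color 0, the sequence produced is `[0, 1, 3, 4, 6, 7]`, and it is descending in the example ordering, as the case $i = 0$ of the proof predicts.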


Let us break this down a bit more. The proof shows that each of $\mathsf{D}^2_2$ and ADS is
computably reducible to $\mathsf{PT}^2_2$, and we also have $\mathsf{SRT}^2_2 \leq_{\mathrm{c}} \mathsf{D}^2_2$ and $\mathsf{COH} \leq_{\mathrm{c}} \mathsf{ADS}$ by
Proposition 8.4.8 and Theorem 9.2.8. In terms of the compositional product from
Definition 4.5.11, Theorem 8.5.4 says that $\mathsf{RT}^2_2 \leq_{\mathrm{c}} \mathsf{SRT}^2_2 * \mathsf{COH}$. Thus, what we really
showed in the above proof is that $\mathsf{RT}^2_2 \leq_{\mathrm{c}} \mathsf{PT}^2_2 * \mathsf{PT}^2_2$. This raises the following question.


Question 9.3.5. Is it the case that $\mathsf{RT}^2_2 \leq_{\mathrm{c}} \mathsf{PT}^2_2$?


Another open question is whether Theorem 9.3.4 can be strengthened by replacing
$\mathsf{PT}^2_2$ with $\mathsf{IPT}^2_2$.


Question 9.3.6. Is it the case that $\mathsf{RCA}_0 \vdash \mathsf{IPT}^2_2 \to \mathsf{RT}^2_2$?


We can get a lower bound on the strength of $\mathsf{IPT}^2_2$ by considering the stable versions
of each of $\mathsf{PT}^2_2$ and $\mathsf{IPT}^2_2$. Let $\mathsf{SPT}^2_2$ and $\mathsf{SIPT}^2_2$ be the restrictions of each of these
principles, respectively, to stable colorings.


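Proposition 9.3.7 (Dzhafarov and Hirst [83]).

1. $\mathsf{RCA}_0 \vdash \mathsf{IPT}^2_2 \to \mathsf{SRT}^2_2$.
2. $\mathsf{RCA}_0 \vdash \mathsf{SRT}^2_2 \leftrightarrow \mathsf{SPT}^2_2 \leftrightarrow \mathsf{SIPT}^2_2$.


Proof. Again, the case $k = 1$ is trivial, so assume $k \geq 2$. The argument that
$\mathsf{PT}^2_2 \to \mathsf{SRT}^2_2$ in Theorem 9.3.4 actually shows that $\mathsf{SIPT}^2_2 \to \mathsf{SRT}^2_2$. Clearly,
$\mathsf{IPT}^2_2 \to \mathsf{SIPT}^2_2$, so we have (1). Also, $\mathsf{SRT}^2_2 \to \mathsf{SPT}^2_2 \to \mathsf{SIPT}^2_2$, since each principle
in this chain of implications is a subproblem of the previous. Thus we also have (2). □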


It turns out that part (2) also holds under Weihrauch reducibility, but not under any 
finer reducibility notions. 


Theorem 9.3.8 (Nichols [232]). 


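1. $\mathsf{SRT}^2_2 \equiv_W \mathsf{SPT}^2_2 \equiv_W \mathsf{SIPT}^2_2$.
2. $\mathsf{SRT}^2_2 \nleq_{sc} \mathsf{SPT}^2_2$.
3. $\mathsf{SPT}^2_2 \nleq_{sc} \mathsf{SIPT}^2_2$.
4. $\mathsf{SIPT}^2_2 \nleq_{sc} \mathsf{D}^2_2$.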


This complements Theorems 9.1.18 and 9.2.34. Collectively, these results suggest
that in the pantheon of reducibilities we work with, $\leq_{sc}$ is best at distinguishing results
on the basis of combinatorial (as opposed to computability theoretic) properties. Parts
(2)-(4) are proved by a forcing argument using the tree labeling method. Part (3) has
a particularly intricate combinatorial core, reflective of how close $\mathsf{SPT}^2_2$ and $\mathsf{SIPT}^2_2$
are to one another.

Finally, we present a result due to Patey [240] showing that Proposition 9.3.7 (1) is
a strict implication. For this, we let $\mathsf{DNR}(\emptyset')$ be the principle asserting that for every
set $X$, there exists a function which is DNC relative to $X'$. The following somewhat
more careful formulation shows how this statement is formalized in $\mathsf{RCA}_0$.


Definition 9.3.9. $\mathsf{DNR}(\emptyset')$ is the following statement: for every set $X$ there exists a
function $f : \mathbb{N} \to \mathbb{N}$ such that for all $e \in \mathbb{N}$, if $\sigma \in 2^{<\mathbb{N}}$ is such that $\Phi^\sigma_e(e){\downarrow}$ and
$\sigma(i) = 1$ if and only if $\Phi^X_i(i){\downarrow}$ for all $i < |\sigma|$, then $f(e) \neq \Phi^\sigma_e(e)$.


Theorem 9.3.10 (Patey [240]). $\mathsf{RCA}_0 \vdash \mathsf{IPT}^2_2 \to \mathsf{DNR}(\emptyset')$.


Proof. We argue in $\mathsf{RCA}_0$. First, we can formalize the $S^m_n$ theorem (Theorem 2.4.10),
in the manner of Section 5.5.3. Now fix $X$. Using the $S^m_n$ theorem and the limit
lemma (which is provable in $\mathsf{I}\Sigma^0_1$, as noted in Section 6.4) we can conclude that
there is a function $g : \mathbb{N}^2 \to \mathbb{N}$ (of the form $\Phi^X_i$, for some $i \in \mathbb{N}$) such that for
all $e \in \mathbb{N}$, $\lim_s g(e, s)$ exists if and only if there exists a $\sigma \in 2^{<\mathbb{N}}$ such that
$\Phi^\sigma_e(e){\downarrow}$ and $\sigma(i) = 1$ if and only if $\Phi^X_i(i){\downarrow}$ for all $i < |\sigma|$, and in
this case $\lim_s g(e, s) = \Phi^\sigma_e(e)$. Therefore, it suffices to exhibit $f : \mathbb{N} \to \mathbb{N}$ such that
$f(e) \neq \lim_s g(e, s)$ if the latter exists.

The argument now is somewhat similar to Theorem 8.2.2. We define an instance
$c : [\mathbb{N}]^2 \to 2$ of $\mathsf{IPT}^2_2$ by induction. At stage $s \geq 0$, we define $c(x, s)$ for all $x < s$.
Let $s \geq 0$ be given. For each $e < s$, decode $g(e, s) \in \mathbb{N}$ as a tuple of numbers smaller
than $s$ of length $2e + 2$. Let $D_{e,s}$ be the finite set of these numbers.

Each $e < s$ now claims a pair of elements of $D_{e,s}$. Namely, $e$ claims the least two
elements of $D_{e,s}$ that are not claimed by any $e^* < e$. (Note that $e^*$ can only claim
elements of $D_{e^*,s}$, but $D_{e^*,s}$ and $D_{e,s}$ can intersect.) Since $D_{e,s}$ has size $2e + 2$,
there are guaranteed to be at least two elements for $e$ to claim. Say $e$ claims $x < y$,
and set $c(x, s) = 0$ and $c(y, s) = 1$. Finally, fix any $x < s$ not claimed by any $e < s$
and set $c(x, s) = 0$. This completes the construction.

It is clear that $c$ exists and that it is an instance of $\mathsf{IPT}^2_2$. Let $\langle H_0, H_1 \rangle$ be an
infinite increasing p-homogeneous set for $c$. Define $f : \mathbb{N} \to \mathbb{N}$ as follows: for all
$e$, let $f(e)$ code the least $2e + 2$ many elements of $H_0$. Seeking a contradiction,
suppose $\lim_s g(e, s)$ exists and $f(e) = \lim_s g(e, s)$. Fix $s_0$ so that $g(e, s) = g(e, s_0)$
for all $s \geq s_0$. Then for all $s \geq s_0$, $D_{e,s}$ equals the first $2e + 2$ many elements of
$H_0$. Hence, by construction, for all $s \geq s_0$ there exist $x, y \in D_{e,s} \subseteq H_0$ such that
$c(x, s) \neq c(y, s)$. But since $H_1$ is infinite we can choose such an $s \geq s_0$ in $H_1$, and
then we have a contradiction. □
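
A single stage of the construction in this proof can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the formal construction: `stage_colors` and `first_numbers` are hypothetical names, `decode(e, s)` stands in for decoding $g(e, s)$ into a collection of numbers, and we assume (the text leaves this case implicit) that an $e$ whose value does not decode to $2e + 2$ distinct numbers below $s$ simply claims nothing.

```python
def stage_colors(s, decode):
    """Return a dict mapping each x < s to the color c(x, s) set at stage s."""
    claimed = set()
    colors = {}
    for e in range(s):  # smaller e act first, so they have priority
        D = decode(e, s)
        if D is None:
            continue
        D = sorted(set(D))
        if len(D) != 2 * e + 2 or D[0] < 0 or D[-1] >= s:
            continue  # malformed tuple: e claims nothing (our assumption)
        free = [x for x in D if x not in claimed]
        # At most 2e elements were claimed by e* < e, so two free ones remain.
        x, y = free[0], free[1]
        claimed.update((x, y))
        colors[x], colors[y] = 0, 1
    for x in range(s):
        colors.setdefault(x, 0)  # unclaimed elements get color 0
    return colors


def first_numbers(e, s):
    """Hypothetical decoder: D_{e,s} = {0, ..., 2e+1} whenever that fits below s."""
    return list(range(2 * e + 2)) if 2 * e + 2 <= s else None


stage6 = stage_colors(6, first_numbers)
```

The point of the claiming step is visible here: every $D_{e,s}$ that decodes properly ends up containing one element colored 0 and one colored 1 at stage $s$.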


Corollary 9.3.11 (Patey [240]). $\mathsf{RCA}_0 \nvdash \mathsf{SRT}^2_2 \to \mathsf{IPT}^2_2$.


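Proof. Consider the model $\mathcal{M}$ of Chong, Slaman, and Yang [39] from Theorem 8.8.5.
This model satisfies $\mathsf{RCA}_0 + \mathsf{SRT}^2_2$, plus every $X \in \mathcal{S}^{\mathcal{M}}$ is low in $\mathcal{M}$. But clearly no
$f$ as in the definition of $\mathsf{DNR}(\emptyset')$ can be low in $\mathcal{M}$. So, $\mathsf{IPT}^2_2$ must fail in $\mathcal{M}$. □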


The Monin–Patey theorem (Corollary 8.8.8) also suggests the following question.


Question 9.3.12. Is it the case that $\mathsf{IPT}^2_2 \leq_\omega \mathsf{SRT}^2_2$?


9.4 Rainbow Ramsey’s theorem 


The principle $\mathsf{DNR}(\emptyset')$ defined at the end of the preceding section turns out to have
a characterization as a variant of Ramsey’s theorem. Thus, its appearance above
in connection with $\mathsf{IPT}^2_2$ is not entirely random. It is actually a surprising fact that
computability theoretic properties can be equivalent to purely combinatorial ones
in our framework. One occurrence of this that we have already seen is the degree
characterization of the solutions of COH, in Theorem 8.4.13. We will see several 
others in Section 9.10 below. 

Our interest here will be in the following principle known as the rainbow Ramsey’s 
theorem, which was first studied in reverse mathematics by Csima and Mileti [59]. 


Definition 9.4.1 (Rainbow Ramsey’s theorem). For $n, k \geq 1$, $\mathsf{RRT}^n_k$ is the following
statement: for all $c : [\omega]^n \to \omega$, if $|c^{-1}(z)| \leq k$ for all $z \in \omega$, then there exists an
infinite set $R \subseteq \omega$ such that $c$ is injective on $[R]^n$.


The set $R$ above is called a rainbow for the coloring $c$. A function $c$ with the property
that $|c^{-1}(z)| \leq k$ for all $z$ is sometimes called $k$-bounded in the literature, but we
will not use this term to avoid confusion with earlier terms like “$k$-valued” and
“computably bounded”.

The rainbow Ramsey’s theorem in some sense says the opposite of Ramsey’s 
theorem: instead of looking for a set on which the coloring uses just one color, it 
looks for a set on which it uses each color at most once. However, though it may not 
be obvious at first, the latter property is actually a consequence of homogeneity. 


Proposition 9.4.2 (Galvin, unpublished). 


1. For all $n, k \geq 1$, $\mathsf{RRT}^n_k$ is identity reducible to $\mathsf{RT}^n_k$.
2. $\mathsf{RCA}_0 \vdash (\forall n)(\forall k)[\mathsf{RT}^n_k \to \mathsf{RRT}^n_k]$.


Proof. We prove (1). Let $c : [\omega]^n \to \omega$ be an instance of $\mathsf{RRT}^n_k$. Define $d : [\omega]^n \to k$
as follows: for all $\vec{x} \in [\omega]^n$,

$$d(\vec{x}) = |\{\vec{y} \leq \max \vec{x} : \vec{y} \neq \vec{x} \wedge c(\vec{y}) = c(\vec{x})\}|.$$

As there are at most $k$ many tuples of any given color under $c$, we have $d(\vec{x}) < k$.
Now let $H$ be an infinite homogeneous set for $d$ and suppose $\vec{y}, \vec{x} \in [H]^n$, say with
$\vec{y} \leq \max \vec{x}$. Then if we had $\vec{y} \neq \vec{x}$ and $c(\vec{y}) = c(\vec{x})$ we would have by definition that
$d(\vec{x}) = d(\vec{y}) + 1$, which cannot be. We conclude that $c$ is injective on $[H]^n$. □


Csima and Mileti [59] undertook an in-depth analysis of the computability theoretic
and proof theoretic content of the rainbow Ramsey’s theorem. The previous
proposition immediately implies that $\mathsf{RRT}^n_k$ inherits the upper bounds on its strength from
$\mathsf{RT}^n_k$. Interestingly, in terms of the arithmetical hierarchy, it also has the same lower
bounds. Thus, we have the following analogue of Theorems 8.1.1 and 8.2.2.


Theorem 9.4.3 (Csima and Mileti [59]). Fix $n, k \geq 1$.


1. For each $n, k \geq 1$, $\mathsf{RRT}^n_k$ admits $\Pi^0_n$ solutions and $\mathsf{ACA}_0 \vdash \mathsf{RRT}^n_k$. Hence, also
$\mathsf{ACA}_0' \vdash (\forall n)(\forall k)\mathsf{RRT}^n_k$.

2. For each $n \geq 2$ and $k \geq 2$, $\mathsf{RRT}^n_k$ omits $\Sigma^0_n$ solutions. Hence, $\mathsf{ACA}_0 \nvdash
(\forall n)(\forall k)\mathsf{RRT}^n_k$.


However, in other ways the rainbow Ramsey’s theorem turns out to be notably
weaker than Ramsey’s theorem. The following result of Wang resolved a longstanding
question about whether an analogue of Jockusch’s Theorem 8.2.4 holds in this setting
(i.e., whether we can code the jump).


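Theorem 9.4.4 (Wang [321]). $(\forall n)(\forall k)\mathsf{RRT}^n_k$ admits strong cone avoidance.
Hence, $\mathsf{TJ} \nleq_\omega (\forall n)(\forall k)\mathsf{RRT}^n_k$ and $\mathsf{RCA}_0 \nvdash (\forall n)(\forall k)\mathsf{RRT}^n_k \to \mathsf{ACA}_0$.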


Even though the $n = 2$ case of the rainbow Ramsey’s theorem is no longer as
special as it is for Ramsey’s theorem, it is nonetheless of special interest. Arguably
the most striking result of Csima and Mileti is an unexpected connection between
$\mathsf{RRT}^2_k$ and algorithmic randomness. We pause here to go over some basic notions
from this subject. We will need these again in Section 9.11.

We begin by recalling Cantor space measure (or “fair coin” measure), $\mu$. As
usual, this is first defined on basic open sets: for $\sigma \in 2^{<\omega}$, the measure of $[[\sigma]]$
is $\mu([[\sigma]]) = 2^{-|\sigma|}$. Now, consider an arbitrary open set, $\mathcal{U} \subseteq 2^\omega$. This can be
written (not uniquely) in the form $\bigcup_{\sigma \in U} [[\sigma]]$, where $U \subseteq 2^{<\omega}$ is prefix free (i.e.,
if $\sigma \prec \tau \in U$ then $\sigma \notin U$). Then $\mu(\mathcal{U})$ is defined to be $\sum_{\sigma \in U} 2^{-|\sigma|}$, and it can be
shown that this value does not depend on the choice of $U$. (See Exercise 9.13.5.)
From here, the definition of $\mu$ is extended to other subsets of $2^\omega$ as in the usual
definition of Lebesgue measure on $\mathbb{R}$. First, every set $\mathcal{C}$ is assigned an outer measure
$\mu^*(\mathcal{C})$, equal to the infimum over all open sets $\mathcal{U} \supseteq \mathcal{C}$ of $\mu(\mathcal{U})$. Second, every such
$\mathcal{C}$ is assigned an inner measure $\mu_*(\mathcal{C}) = 1 - \mu^*(2^\omega \setminus \mathcal{C})$. If $\mu^*(\mathcal{C}) = \mu_*(\mathcal{C})$ then $\mathcal{C}$
is called measurable, and $\mu(\mathcal{C})$ is defined to be $\mu^*(\mathcal{C})$.
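
For a finite prefix free set of strings, the measure of the open set it generates can be computed exactly. The following Python sketch illustrates this; the function names `is_prefix_free` and `measure` are ours, and binary strings are represented as Python strings over the characters 0 and 1.

```python
from fractions import Fraction


def is_prefix_free(U):
    """True if no string in U is a proper prefix of another string in U."""
    return not any(s != t and t.startswith(s) for s in U for t in U)


def measure(U):
    """Measure of the open set generated by a finite prefix free set U:
    the sum of 2^(-|sigma|) over sigma in U."""
    assert is_prefix_free(U), "the formula requires a prefix free generating set"
    return sum(Fraction(1, 2 ** len(s)) for s in U)


coarse = measure({"0", "10"})
fine = measure({"00", "01", "10"})
```

The two generating sets above describe the same open set, and the computed values agree, in line with the independence from the choice of generating set noted above.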

We can now pass to one of the central notions of algorithmic randomness. In what
follows, say a sequence $\langle \mathcal{U}_n : n \in \omega \rangle$ of subsets of $2^\omega$ is uniformly $\Sigma^0_1$ if there is
a uniformly c.e. sequence $\langle U_n : n \in \omega \rangle$ of subsets of $2^{<\omega}$ such that $\mathcal{U}_n = [[U_n]]$ for
all $n$.


Definition 9.4.5 (Martin-Löf [205]).


1. A Martin-Löf test is a uniformly $\Sigma^0_1$ sequence $\langle \mathcal{U}_n : n \in \omega \rangle$ of subsets of $2^\omega$
such that $\mu(\mathcal{U}_n) \leq 2^{-n}$ for all $n$.

2. A set $X$ passes a Martin-Löf test $\langle \mathcal{U}_n : n \in \omega \rangle$ if $X \notin \bigcap_n \mathcal{U}_n$.

3. A set $X$ is 1-random (or Martin-Löf random) if it passes every Martin-Löf test.

4. For $n \geq 1$, a set $X$ is $n$-random if it is 1-random relative to $\emptyset^{(n-1)}$.


The intuition here is that if $X \in 2^\omega$ is random, there should be nothing special that
we can say about it. That is, X should not satisfy any property that can be effectively 
described and is not enjoyed by all (or almost all, in the sense of measure) sets. 
For example, an X whose every other bit is 0, or whose elements as a set form 
an arithmetical progression, or whose odd and even halves compute one another— 
intuitively, none of these sets should be random. (See Exercise 9.13.4.) A Martin-Löf
test is thus a test against some such property, with the measure condition on the
members of the test ensuring that it captures only measure 0 many sets. A 1-random 
set, being one that passes every test, should thus be random in the intuitive sense. 
We leave to the exercises the following well-known and important facts.


Theorem 9.4.6 (Martin-Löf [205]). There exists a Martin-Löf test $\langle \mathcal{U}_n : n \in \omega \rangle$,
called a universal Martin-Löf test, such that $X$ is 1-random if and only if it passes
$\langle \mathcal{U}_n : n \in \omega \rangle$.


Corollary 9.4.7. There exists a nonempty $\Pi^0_1$ class all of whose elements are
1-random.


Of course, there is much more that can be said here, and there are many other 
definitions of randomness that have been considered (some equivalent to the above, 
some not). Our discussion will be confined to a handful of applications, so we 
refer the reader to other books—most notably those of Downey and Hirschfeldt [83] 
and Nies [233]—for a thorough discussion, including a comparison of randomness 
notions, connections with other computability theoretic properties, and historical 
background. For an overview of recent developments in the subject, see [106]. 

Let us now return to our discussion of the rainbow Ramsey’s theorem. The 
hallmark result of [59] is the following. 


Theorem 9.4.8 (Csima and Mileti [59]). Fix $k \geq 1$ and $A \in 2^\omega$. If $c : [\omega]^2 \to \omega$
is an $A$-computable instance of $\mathsf{RRT}^2_k$ and $X$ is 1-random relative to $A'$ then $A \oplus X$
computes an $\mathsf{RRT}^2_k$-solution to $c$.


Essentially, this says that if we are given an instance of $\mathsf{RRT}^2_k$ and guess numbers
randomly enough, then we will obtain a solution to this instance. Intuitively, this
makes sense. Instances of $\mathsf{RRT}^2_k$ are almost injective to begin with (on all of $[\omega]^2$),
and given that $k$ is fixed, the probability of picking two numbers colored the same
should be low. Still, it is remarkable that this can be made precise and that it is true.

To better understand Theorem 9.4.8, and $\mathsf{RRT}^2_k$ in general, we need several auxiliary
combinatorial notions. The first may be regarded as a weak form of stability.
(For other definitions of stable instances of $\mathsf{RRT}^2_2$, see Patey [240].)


Definition 9.4.9. Fix $k \geq 1$. An instance $c : [\omega]^2 \to \omega$ of $\mathsf{RRT}^2_k$ is normal if for all
$x_0 < x_1$ and $y_0 < y_1$, if $x_1 \neq y_1$ then $c(x_0, x_1) \neq c(y_0, y_1)$.


Proposition 9.4.10 (Csima and Mileti [59]). For all $n, k \geq 1$, if $c : [\omega]^n \to \omega$ is
an instance of $\mathsf{RRT}^n_k$, there exists an infinite $c$-computable set $I \subseteq \omega$ such that $c \restriction [I]^n$
is normal.


We leave the proof to the exercises. 


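Definition 9.4.11. Fix $n, k \geq 1$ and let $c : [\omega]^2 \to \omega$ be an instance of $\mathsf{RRT}^2_k$.


1. If $F \subseteq \omega$ is finite, then $\mathrm{Viab}_c(F) = \{x > F : c \restriction [F \cup \{x\}]^2 \text{ is injective}\}$.
2. $F \subseteq \omega$ is admissible for $c$ if $F$ is finite and $\mathrm{Viab}_c(F)$ is infinite.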
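
To make the definition of $\mathrm{Viab}_c(F)$ concrete, here is a small Python sketch that lists its elements below a given bound. The names `is_injective_on_pairs`, `viable_below`, and `sample_coloring` are hypothetical, and the sample coloring is a made-up example. Admissibility itself concerns whether $\mathrm{Viab}_c(F)$ is infinite, so it cannot be decided by a finite search of this kind.

```python
from itertools import combinations


def is_injective_on_pairs(c, S):
    """True if c assigns distinct colors to distinct pairs x < y from S."""
    colors = [c(x, y) for x, y in combinations(sorted(S), 2)]
    return len(colors) == len(set(colors))


def viable_below(c, F, bound):
    """Elements x of Viab_c(F) with x < bound, i.e. x > F such that
    c is injective on the pairs from F together with x."""
    start = max(F) + 1 if F else 0
    return [x for x in range(start, bound)
            if is_injective_on_pairs(c, set(F) | {x})]


def sample_coloring(x, y):
    """Made-up coloring: injective except that (0, 3) and (1, 3) share a color."""
    return "shared" if (x, y) in {(0, 3), (1, 3)} else (x, y)


viable = viable_below(sample_coloring, {0, 1}, 6)
```

In the sample, the element 3 is excluded because adding it to $\{0, 1\}$ creates two pairs of the same color, while 2, 4, and 5 remain viable.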


The following result says that if a finite set $F$ looks like an initial segment of a
solution to $c$ (because $c \restriction [F]^2$ is injective) and $F$ has infinitely many one-point
extensions that still look like an initial segment of a solution, then in fact, $F$ can
be extended to a solution. This makes $\mathsf{RRT}^2_k$-solutions quite special. (The analogous
result for Ramsey’s theorem is very much false.)


Proposition 9.4.12 (Csima and Mileti [59]). For all $k \geq 1$, if $c : [\omega]^2 \to \omega$ is a
normal instance of $\mathsf{RRT}^2_k$ and $F \subseteq \omega$ is admissible for $c$ then

$$|\{x \in \mathrm{Viab}_c(F) : F \cup \{x\} \text{ is not admissible for } c\}| \leq k \cdot |F|.$$


Proof. If $k = 1$ there is nothing to prove, so assume $k \geq 2$. Let $n = k \cdot |F|$. Seeking a
contradiction, suppose there are at least $n + 1$ many $x \in \mathrm{Viab}_c(F)$ such that $F \cup \{x\}$ is
not admissible for $c$. Let $x_0, \ldots, x_n$ witness this fact. Thus $\mathrm{Viab}_c(F \cup \{x_i\})$ is finite
for each $i \leq n$. Since $F$ is admissible for $c$, we can choose a $z > \bigcup_{i \leq n} \mathrm{Viab}_c(F \cup \{x_i\})$
in the infinite set $\mathrm{Viab}_c(F)$.

Fix $i \leq n$. By choice of $z$, $c$ is not injective on $[F \cup \{x_i, z\}]^2$. But $x_i, z \in \mathrm{Viab}_c(F)$,
so $c$ is injective on both $[F \cup \{x_i\}]^2$ and on $[F \cup \{z\}]^2$. It follows that one of the
witnesses to noninjectivity of $c$ on $[F \cup \{x_i, z\}]^2$ must be $(x_i, z)$. And since $c$ is
normal, the other witness must be $(y, z)$ for some $y \in F$.

We conclude that for each $i \leq n$ there exists a $y_i \in F$ such that $c(x_i, z) = c(y_i, z)$.
The assignment $i \mapsto y_i$ defines a map $k \cdot |F| \to |F|$. If the preimage of each
$y \in F$ under this map had size smaller than $k$, then the size of the full preimage
of $F$ would have size smaller than $k \cdot |F|$. So we can fix a $y \in F$ whose preimage
$G \subseteq \{0, 1, \ldots, n\}$ has size at least $k$. Thus, $y_i = y$ for all $i \in G$. But then for all $i \in G$
we have $c(x_i, z) = c(y, z)$. Since all the $x_i$ for $i \in G$ are distinct, and $y$ is distinct
from each of these, we have at least $k + 1$ many pairs all with the same color under
$c$. This contradicts the fact that $c$ is an instance of $\mathsf{RRT}^2_k$. □


We now come to the aforementioned characterization of $\mathsf{RRT}^2_2$ in terms of DNC
functions relative to $\emptyset'$.


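Theorem 9.4.13 (Miller, unpublished). Fix $k \geq 2$.


1. $\mathsf{DNR}(\emptyset') \equiv_W \mathsf{RRT}^2_k$.
2. $\mathsf{RCA}_0 \vdash \mathsf{DNR}(\emptyset') \leftrightarrow \mathsf{RRT}^2_k$.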


Proof. We prove (1). The proof uses Propositions 9.4.10 and 9.4.12, and each of
these can be formalized in $\mathsf{RCA}_0$ easily enough. It is then straightforward to formalize
the argument and obtain (2).


($\mathsf{DNR}(\emptyset') \leq_W \mathsf{RRT}^2_k$). An instance of $\mathsf{DNR}(\emptyset')$ is a set, $A$, and a solution is a
function DNC relative to $A'$. We prove the result for $A = \emptyset$. The general case then
follows by relativization. It therefore suffices to build a computable instance $c$ of
$\mathsf{RRT}^2_k$ every solution to which computes a function DNC relative to $\emptyset'$. In fact, the
$c$ we build will be an instance of $\mathsf{RRT}^2_2$.

We construct $c$ by stages, very similarly to the coloring in the proof of
Theorem 9.3.10. As there, let $g : \omega^2 \to \omega$ be a computable function such that for all $e$, if
$\Phi^{\emptyset'}_e(e){\downarrow}$ then $\lim_s g(e, s){\downarrow} = \Phi^{\emptyset'}_e(e)$. At stage $s \geq 0$, define $c(x, s)$ for all $x < s$. For
each $e < s$, decode $g(e, s)$ as a tuple of numbers smaller than $s$ of length $2e + 2$ and
let $D_{e,s}$ be the set of these numbers. Each $e < s$ now claims the least two elements
$x, y$ of $D_{e,s}$ not claimed by any $e^* < e$ and sets $c(x, s) = c(y, s) = \langle e, s \rangle$. At the end,
for any $x < s$ not claimed by any $e < s$ let $c(x, s) = \langle x + s, s \rangle$. This completes the
construction.

It is easy to see that $|c^{-1}(z)| \leq 2$ for all $z \in \omega$. Let $R = \{r_0 < r_1 < \cdots\}$ be
any $\mathsf{RRT}^2_2$-solution to $c$. Define $f \leq_T R$ as follows: for all $e$, let $f(e)$ be a code
for the least $2e + 2$ many elements of $R$. We claim that $f$ is DNC relative to $\emptyset'$. If
not, $\Phi^{\emptyset'}_e(e){\downarrow} = f(e)$ for some $e$. Fix $s$ larger than all elements in the tuple coded by
$f(e)$ and large enough so that $g(e, t) = \Phi^{\emptyset'}_e(e)$ for all $t \geq s$. Since $R$ is infinite, we
can without loss of generality take $s \in R$. Then at stage $s$ of the construction, $D_{e,s}$
consists of the least $2e + 2$ many elements of $R$, and we set $c(x, s) = c(y, s)$ for some
$x, y \in D_{e,s}$. Since $(x, s), (y, s) \in [R]^2$, this contradicts that $c \restriction [R]^2$ is injective.


($\mathsf{RRT}^2_k \leq_W \mathsf{DNR}(\emptyset')$). Fix $k \geq 1$ and let $c : [\omega]^2 \to \omega$ be an instance of $\mathsf{RRT}^2_k$. Again,
we deal with the unrelativized case, so assume $c$ is computable. By Proposition 9.4.10
there is an infinite computable $I \subseteq \omega$ such that $c \restriction [I]^2$ is normal. By passing everything
through a computable bijection, we may assume $I = \omega$ for ease of notation.

Let $f$ be DNC relative to $\emptyset'$. We define an infinite $f$-computable set $R = \{r_0 <
r_1 < \cdots\}$ such that for each $i$, $R_i = \{r_j : j < i\}$ is admissible for $c$. Notice that
this implies that $c$ is injective on $[R]^2$. The conclusion is trivially true for $R_0 = \emptyset$.
Assume, then, that for some $i \in \omega$ we have defined $r_j$ for all $j < i$ and that $R_i$ is
admissible for $c$. Say $\mathrm{Viab}_c(R_i) = \{x_0 < x_1 < \cdots\}$, and without loss of generality,
assume $x_0 > R_i$. Define


$$W = \{m \in \omega : R_i \cup \{x_m\} \text{ is not admissible for } c\}.$$


By Proposition 9.4.12, $|W| \leq k \cdot |R_i| = ki$. Notice that $\mathrm{Viab}_c(R_i)$ is uniformly
computable in $R_i$, so $W$ is $\Sigma^0_2$ in $R_i$ and hence uniformly $\Sigma^0_1(\emptyset')$.

So, for each $\ell < ki$, we can uniformly computably find an index $e_\ell$ such that, if
the $\ell$th number enumerated into $W$ is $m$, then $\Phi^{\emptyset'}_{e_\ell}(x){\downarrow} = m(\ell)$ for all $x$, where we
interpret $m$ as a code for a string of length $ki$. Now, let $m = \langle f(e_\ell) : \ell < ki \rangle$. Then
for all $\ell < ki$ we have $m(\ell) = f(e_\ell) \neq \Phi^{\emptyset'}_{e_\ell}(e_\ell)$, so $m$ cannot belong to $W$. Hence
$R_i \cup \{x_m\}$ is admissible for $c$, and we define $r_i = x_m$. □


The connection between the rainbow Ramsey’s theorem for pairs and algorithmic
randomness had no counterpart in, say, Ramsey’s theorem, and so initially appeared
somewhat mysterious. Miller’s theorem (Theorem 9.4.13) clarifies this connection in a
number of ways. For one, it yields Theorem 9.4.8 by relativizing to $\emptyset'$ the following
well-known result of Kučera. Most proofs of this result in textbooks use Kolmogorov
complexity, so for convenience, we include a proof here that uses tests.


Theorem 9.4.14 (Kučera [192]). Every 1-random set $X$ computes a DNC function.


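Proof. For each $n \in \omega$, define

$$\mathcal{U}_n = \{X \in 2^\omega : (\exists e > n)[\Phi_e(e){\downarrow} = X \restriction e]\},$$

where $X \restriction e$ is interpreted as a code for a string of length $e$. Thus $\mathcal{U}_n = \bigcup_{e > n} [[V_e]]$,
where

$$V_e = \begin{cases} \{\sigma\} & \text{if } \Phi_e(e){\downarrow} = \sigma \in 2^e, \\ \emptyset & \text{otherwise.} \end{cases}$$

From here it is easy to see that $\langle \mathcal{U}_n : n \in \omega \rangle$ is a uniformly $\Sigma^0_1$ sequence and that for
each $n$,

$$\mu(\mathcal{U}_n) \leq \sum_{e > n} \mu([[V_e]]) \leq \sum_{e > n} 2^{-e} = 2^{-n}.$$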

It follows that $\langle \mathcal{U}_n : n \in \omega \rangle$ is a Martin-Löf test. If $X$ is 1-random, there is
consequently an $n$ such that $X \notin \mathcal{U}_n$. By construction, this means that for all $e > n$,
if $\Phi_e(e){\downarrow}$ then $\Phi_e(e) \neq X \restriction e$.

Define $f \leq_T X$ by letting $f(e)$ be a code for $X \restriction e$ for all $e$. Then $f(e) \neq \Phi_e(e)$
for all $e > n$, so a finite modification of $f$ is DNC. In particular, $X$ computes a DNC
function. □


Some other results about the strength of $\mathsf{RRT}^2_2$, originally proved in [59] by
different arguments, admit simplified proofs thanks to Theorem 9.4.13. One example
is the following theorem. The proof we give here is obtained by combining separate
results of Monin [216] and Patey [240].


Theorem 9.4.15 (Csima and Mileti [59]). The problem $\mathsf{RRT}^2_2$ omits solutions of
hyperimmune free degree.


Proof (Monin and Patey, see [240]). We prove the theorem for computable
instances of $\mathsf{RRT}^2_2$. The full result follows by relativization. In light of Theorem 9.4.13,
it is enough to show that if $f$ is DNC relative to $\emptyset'$, then $f$ does not have
hyperimmune free degree. Then the computable coloring $c$ constructed in the proof that
$\mathsf{DNR}(\emptyset') \leq_W \mathsf{RRT}^2_k$ will have no solution of hyperimmune free degree, completing
the result.

Seeking a contradiction, suppose otherwise. By Theorem 2.8.21, this means every
$f$-computable function is dominated by a computable function. Consider the class

$$\mathcal{U} = \{X \in 2^\omega : (\exists e)[\Phi^{\emptyset'}_e(e){\downarrow} = X(e)]\}.$$

Clearly, $X \in \mathcal{U}$ if and only if it is not DNC relative to $\emptyset'$. Hence, $f \notin \mathcal{U}$. Now,
it is not difficult to check that $\mathcal{U}$ can be written as $\bigcup_n \mathcal{C}_n$, where each $\mathcal{C}_n$ is a $\Pi^0_1$
class, and whose index as such is uniformly computable in $n$. Let $T_0, T_1, \ldots$ be a
uniformly computable sequence of subtrees of $2^{<\omega}$ with $\mathcal{C}_n = [T_n]$ for all $n$. Thus,
$f \in \bigcap_n [[\{\sigma \in 2^{<\omega} : \sigma \notin T_n\}]]$.

Define a function $g \leq_T f$ as follows: given $n$, let $g(n)$ be the least $k \in \omega$ such that
$f \restriction k \notin T_n$. By assumption, $g$ is dominated by some computable function, $h$. Let

$$\mathcal{D} = \bigcap_n [[\{\sigma \in 2^{h(n)} : \sigma \notin T_n\}]].$$

Then $f \in \mathcal{D}$ by definition of $g$ and $h$. Since $[[\{\sigma \in 2^{h(n)} : \sigma \notin T_n\}]]$ is a clopen
set in the Cantor space (being a finite union of basic open sets), $\mathcal{D}$ is actually a $\Pi^0_1$
class. And since $\mathcal{D}$ contains $f$, it is nonempty. Hence, we can fix a $\emptyset'$-computable
path $p \in \mathcal{D}$. But $\mathcal{D} \subseteq 2^\omega \setminus \mathcal{U}$, so $p$ is also DNC relative to $\emptyset'$. Obviously, this is
impossible. □


Corollary 9.4.16 (Csima and Mileti [59]). $\mathsf{RRT}^2_2 \nleq_\omega \mathsf{WKL}$. Hence, also $\mathsf{RCA}_0 \nvdash
\mathsf{WKL} \to \mathsf{RRT}^2_2$.


Proof. Let $\mathcal{S}$ be an $\omega$-model of $\mathsf{WKL}$ consisting entirely of sets that have
hyperimmune free degree (Exercise 4.8.13). By Theorem 9.4.15, fix a computable
instance $c$ of $\mathsf{RRT}^2_2$ with no solution of hyperimmune free degree. Then $c$ witnesses
that $\mathcal{S} \nvDash \mathsf{RRT}^2_2$. □


Overall, $\mathsf{RRT}^2_2$ turns out to be a very weak principle. By Theorem 9.4.13, it
implies $\mathsf{DNR}$ over $\mathsf{RCA}_0$. However, we now show that it implies neither $\mathsf{SADS}$ nor
$\mathsf{COH}$. There are two technical ingredients to this. The first is the following famous
result of van Lambalgen.


Theorem 9.4.17 (van Lambalgen [317]). Fix $n \geq 1$. A set $X = X_0 \oplus X_1$ is $n$-random
if and only if $X_0$ is $n$-random and $X_1$ is $n$-random relative to $X_0$.


For a proof, see, e.g., Downey and Hirschfeldt [83, Theorems 6.9.1 and 6.9.2]. The
second technical ingredient is the following. Here we will only need it for $n = 2$, which is
also the version proved in Csima and Mileti [59]. But the proof is the same for any
$n$, and we will use the more general version in Section 9.11.


Theorem 9.4.18 (Folklore). Fix $n \geq 1$ and let $X$ be any $n$-random set. There exists
an $\omega$-model $\mathcal{S}$ with the following properties.


1. Every set in $\mathcal{S}$ is $X$-computable.
2. For each $Y \in \mathcal{S}$ there is a $Y^* \in \mathcal{S}$ which is $n$-random relative to $Y$.


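Proof. Fix $X$. Write $X = \bigoplus_{i \in \omega} X_i$ and define

$$\mathcal{S} = \{Y \in 2^\omega : (\exists k)[Y \leq_T \bigoplus_{i < k} X_i]\}.$$

Clearly, this is an $\omega$-model and every $Y \in \mathcal{S}$ is $X$-computable. Fix any such $Y$, say
with $Y \leq_T \bigoplus_{i < k} X_i$. Since

$$X = \Big(\bigoplus_{i < k} X_i \oplus X_k\Big) \oplus \bigoplus_{i > k} X_i$$

is $n$-random, it follows by van Lambalgen’s theorem that $\bigoplus_{i < k} X_i \oplus X_k$ is $n$-random.
Hence, again by van Lambalgen’s theorem, $X_k$ is $n$-random relative to $\bigoplus_{i < k} X_i$, and
therefore relative to $Y$. Since $X_k \in \mathcal{S}$, the proof is complete. □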


Corollary 9.4.19 (Csima and Mileti [59]). Let $X$ be 2-random. Then there exists
an $\omega$-model of $\mathsf{RRT}^2_2$ consisting entirely of $X$-computable sets.


Proof. By Theorem 9.4.18 with $n = 2$ and Theorem 9.4.8. □


We now get a couple of separation results in rapid succession.


Corollary 9.4.20 (Csima and Mileti [59]). $\mathsf{RRT}^3_2 \nleq_\omega \mathsf{RRT}^2_2$. Hence, also $\mathsf{RCA}_0 \nvdash
\mathsf{RRT}^2_2 \to \mathsf{RRT}^3_2$.


Proof. By Corollary 9.4.7 relative to $\emptyset'$, there exists a nonempty $\Pi^0_1(\emptyset')$ class all of whose
elements are 2-random. Hence, there exists a $\Delta^0_3$ 2-random set $X$. Let $\mathcal{S}$ be a model
of $\mathsf{RRT}^2_2$ consisting entirely of $X$-computable sets, as given by Corollary 9.4.19.
In particular, every element of $\mathcal{S}$ is $\Delta^0_3$. By Theorem 9.4.3, there is a computable
instance of $\mathsf{RRT}^3_2$ having no $\Sigma^0_3$ (hence, no $\Delta^0_3$) solution. This coloring witnesses that
$\mathcal{S} \nvDash \mathsf{RRT}^3_2$. □


A similar argument allows us to show that $\mathsf{RRT}^2_2$ does not imply $\mathsf{SADS}$. Here, we
need one technical result, whose proof we omit.


Theorem 9.4.21 (Mileti [211]). If $S \in 2^\omega$ is hyperimmune then
$\mu(\{X \in 2^\omega : X \text{ computes an infinite subset of } S\}) = 0$.


Corollary 9.4.22 (Csima and Mileti [59]). $\mathsf{SADS} \nleq_\omega \mathsf{RRT}^2_2$. Hence, also $\mathsf{RCA}_0 \nvdash
\mathsf{RRT}^2_2 \to \mathsf{SADS}$.


Proof. Let $(L, <_L)$ be the computable $\mathsf{SADS}$-instance given by Proposition 9.2.17.
Thus, every solution to this instance is hyperimmune. Let $A \subseteq L$ be the set of all
$<_L$-small elements of $L$, and $B \subseteq L$ the set of all $<_L$-large elements of $L$. So also
each of $A$ and $B$ must be hyperimmune, and every solution to $(L, <_L)$ is a subset of
$A$ or $B$. It follows by Theorem 9.4.21 that

$$\mu(\{X \in 2^\omega : X \text{ computes an } \mathsf{SADS}\text{-solution to } (L, <_L)\}) = 0.$$

But the measure of all 2-random sets is 1, so we can choose some such $X$ that
computes no solution to $(L, <_L)$. By Corollary 9.4.19, let $\mathcal{S}$ be an $\omega$-model of $\mathsf{RRT}^2_2$
consisting entirely of $X$-computable sets. Then $(L, <_L)$ witnesses that $\mathcal{S} \nvDash \mathsf{SADS}$. □


The weakness of $\mathsf{RRT}^2_2$ is further confirmed by looking at its first order part. We
do this briefly, without giving any proofs, which tend to be rather involved. The first
order consequences of $\mathsf{RRT}^2_2$ were first considered by Conidis and Slaman [55], who
established the following conservation result.


Theorem 9.4.23 (Conidis and Slaman [55]). The system $\mathsf{RCA}_0 + \mathsf{RRT}^2_2$ is
$\Pi^1_1$-conservative over $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2$.


This was subsequently expanded as follows.


Theorem 9.4.24 (Slaman, unpublished). $\mathsf{RCA}_0 \nvdash \mathsf{RRT}^2_2 \to \mathsf{B}\Sigma^0_2$.


In fact, both results hold with $\mathsf{RRT}^2_2$ replaced by the stronger principle 2-$\mathsf{RAN}$,
the formal analogue of the existence of a 2-random set. This implies $\mathsf{RRT}^2_2$ by
formalizing Theorem 9.4.8 in $\mathsf{RCA}_0$. (For a thorough investigation of this principle
in second order arithmetic, including alternative formalizations, see Avigad, Dean,
and Rute [9].)

However, $\mathsf{RRT}^2_2$ does not have entirely trivial first order part either. The following
principle was introduced by Seetapun and Slaman [275].


Definition 9.4.25. Let $\Gamma$ be a collection of formulas of $\mathcal{L}_2$. The $\Gamma$ cardinality scheme
($\mathsf{C}\Gamma$) is the scheme consisting of all sentences of the form, for $\varphi \in \Gamma$: if $\varphi(x, y)$
defines an injective function then it has unbounded range.


Theorem 9.4.26 (Conidis and Slaman [55]). $\mathsf{RCA}_0 \vdash \mathsf{RRT}^2_2 \to \mathsf{C}\Sigma^0_2$.


Seetapun and Slaman [275] showed that $\mathsf{C}\Sigma^0_2$ is not provable in $\mathsf{RCA}_0$. But beyond
this, it is very weak indeed. For example, Slaman (unpublished) showed that for all
n, 1=2 + A, CX2, does not imply Bx? , over RCAo. 

We conclude this section with an observation and an open-ended question. The
proof of Proposition 9.4.2, that $\mathsf{RT}^n_k$ implies $\mathsf{RRT}^n_k$, suggests more generally that
any kind of structure satisfying a version of Ramsey’s theorem should also satisfy
a version of the rainbow Ramsey’s theorem. We will explore a number of other
combinatorial relatives of Ramsey’s theorem in the rest of this chapter, and for
basically all of them no rainbow version has yet been studied. We can thus ask the
following.


Question 9.4.27. What other variants of Ramsey’s theorem admit a rainbow ver- 
sion? How does it relate to the rainbow Ramsey’s theorem? How does it relate to 
algorithmic randomness? 


9.5 Erdős–Moser theorem


The next variant of Ramsey’s theorem we consider is the so-called Erdős–Moser
theorem, also known as the tournament principle. Many a sports fan has lamented
the fact that the relation of (one team) winning against (another team) is not transitive.
If only! But if we can free ourselves of the emotion associated with such events in
real life, we can notice some interesting related mathematics questions. For example,
given $N \in \omega$, how many teams would have to play in a round-robin tournament
(assuming no ties) to guarantee that there are at least $N$ many teams among which
winning is transitive (i.e., if Team A beats Team B, and Team B beats Team C,
then Team A also beats Team C)? In combinatorics, this and related problems have
received a great deal of interest. The name of what has come to be called the
Erdős–Moser theorem in reverse mathematics derives from a pair of influential papers by
Erdős and Moser [99, 100].

Recall the definition of a transitive coloring from Definition 9.2.6.


Definition 9.5.1 (Erdős–Moser principle). $\mathsf{EM}$ is the following statement: for every
coloring $c : [\omega]^2 \to 2$ there exists an infinite set $T \subseteq \omega$ such that $c \restriction [T]^2$ is transitive.


A tournament is perhaps most easily seen as a directed graph, with nodes
representing teams (or players), and arrows indicating who won against whom. A 2-coloring
represents the same information, but highlights the immediate connection of $\mathsf{EM}$
to Ramsey’s theorem. More precisely, $\mathsf{EM}$ is just a subproblem of $\mathsf{RT}^2_2$. As such, it
inherits admitting $\Pi^0_2$ solutions. Kach, Lerman, Solomon, and Weber (unpublished;
see [195]) showed that it omits $\Sigma^0_2$ solutions. So, with respect to the arithmetical
hierarchy, $\mathsf{EM}$ behaves just like $\mathsf{RT}^2_2$. Dzhafarov, Kach, Lerman, and Solomon
(unpublished; see [195]) also showed that $\mathsf{EM}$ omits solutions of hyperimmune free
degree, which is also like $\mathsf{RT}^2_2$.
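
The finite version of the tournament question can be explored by brute force. The following Python sketch assumes the reading of Definition 9.2.6 under which $c$ is transitive on $T$ if, for all $x < y < z$ in $T$, $c(x, y) = c(y, z)$ implies $c(x, z) = c(x, y)$. The names `is_transitive`, `largest_transitive_subset`, and the sample colorings are hypothetical, and the exhaustive search is only feasible for very small tournaments.

```python
from itertools import combinations


def is_transitive(c, T):
    """True if c restricted to pairs from T is transitive (see lead-in)."""
    for x, y, z in combinations(sorted(T), 3):
        if c(x, y) == c(y, z) and c(x, z) != c(x, y):
            return False
    return True


def largest_transitive_subset(c, universe):
    """Brute-force search for a largest subset on which c is transitive."""
    universe = sorted(universe)
    for size in range(len(universe), 0, -1):
        for T in combinations(universe, size):
            if is_transitive(c, T):
                return list(T)
    return []


# A 3-cycle: 0 beats 1, 1 beats 2, but 2 beats 0 (color 1 = "smaller beats larger").
cycle = {(0, 1): 1, (1, 2): 1, (0, 2): 0}


def cycle_coloring(x, y):
    return cycle[(x, y)]


def constant_coloring(x, y):
    return 1


best_in_cycle = largest_transitive_subset(cycle_coloring, [0, 1, 2])
best_constant = largest_transitive_subset(constant_coloring, [0, 1, 2, 3])
```

The 3-cycle is the smallest tournament that is not transitive, so the search there returns a two-element set, while a constant coloring is transitive on the whole universe.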
Over $\mathsf{RCA}_0$, things are more interesting.


Proposition 9.5.2 (Folklore). $\mathsf{RCA}_0 \vdash \mathsf{RT}^2_2 \leftrightarrow \mathsf{EM} + \mathsf{ADS}$.


Proof. For the nontrivial direction, fix an instance $c : [\mathbb{N}]^2 \to 2$ of $\mathsf{RT}^2_2$. Apply $\mathsf{EM}$
to find an infinite set $T$ such that $c \restriction [T]^2$ is transitive. By Proposition 9.2.7, we can
view $c \restriction [T]^2$ as an instance of $\mathsf{ADS}$. Let $H \subseteq T$ be any $\mathsf{ADS}$-solution to this instance.
Then $c \restriction [T]^2$, and hence also $c$, is constant on $[H]^2$. Thus, $H$ is homogeneous for $c$
and we are done. □


The previous result gives a new decomposition of $\mathsf{RT}^2_2$, akin to the
Cholak–Jockusch–Slaman decomposition (Theorem 8.5.1). (Note that, as in Theorem 8.5.4,
what the proof actually shows is that $\mathsf{RT}^2_2 \leq_W \mathsf{EM} * \mathsf{ADS}$.) This decomposition
also turns out to be proper. We already know that $\mathsf{ADS}$ does not imply $\mathsf{RT}^2_2$
(Theorems 9.2.21 and 9.2.22), so it does not imply $\mathsf{EM}$. The question whether $\mathsf{EM}$ implies
$\mathsf{ADS}$ was open for many years.


Theorem 9.5.3 (Lerman, Solomon, and Towsner [195]). $\mathsf{ADS} \nleq_\omega \mathsf{EM}$. Hence,
also $\mathsf{RCA}_0 \nvdash \mathsf{EM} \to \mathsf{ADS}$.


The proof in [195] actually showed that $\mathsf{SRT}^2_2 \nleq_\omega \mathsf{EM}$. This was done by an iterated
forcing construction, similar in style to the original separation of $\mathsf{CAC}$ from $\mathsf{ADS}$
(Theorem 9.2.22), but requiring a deep analysis of a very different set of
combinatorics. Subsequently, both Wang [322] and Patey [243] were able to improve on
this to give even stronger separations. Recall the notion of a problem admitting
preservation of hyperimmunity from Definition 9.1.14.


Theorem 9.5.4 (Patey [243]). EM admits preservation of hyperimmunity but SADS 
does not. 


In fact, Patey [243] showed that $\mathsf{SADS}$ does not even preserve two hyperimmune sets
(i.e., 2 among 2 hyperimmunities, in the parlance of Definition 9.1.12). Hence, by
Theorem 4.6.13, we have $\mathsf{SADS} \nleq_\omega \mathsf{EM}$. Wang [322] used a different preservation
property to obtain the same result.


Definition 9.5.5 (Wang [322]). Let $\Gamma$ denote one of $\Sigma^0_n$, $\Pi^0_n$, or $\Delta^0_n$ for some $n \in \omega$.
A problem $\mathsf{P}$ admits preservation of $\Gamma$ definitions if for every $A \in 2^\omega$ and every
$Z \in 2^\omega$ which is properly $\Gamma$ relative to $A$, every $A$-computable instance $X$ of $\mathsf{P}$ has
a solution $Y$ such that $Z$ is properly $\Gamma$ relative to $A \oplus Y$.


The use of “properly” above is important (see Definition 2.6.1) as otherwise the
definition is trivial.


Theorem 9.5.6 (Wang [322]). $\mathsf{EM}$ admits preservation of $\Delta^0_2$ definitions but $\mathsf{SADS}$
does not.


Actually, Wang [322] showed that a large number of principles admit preservation
of $\Delta^0_2$ definitions, including $\mathsf{COH}$ and $\mathsf{WKL}$. So by Exercise 4.8.12, $\mathsf{SADS}$ is not
$\omega$-model reducible even to, say, $\mathsf{WKL} + \mathsf{COH} + \mathsf{EM}$.

$\mathsf{SADS}$ is one of the weakest principles we have studied so far, so the fact that
$\mathsf{EM}$ does not imply it tells us that $\mathsf{EM}$ is somewhat weak as well. However,
$\mathsf{EM}$ does have some strength. In addition to Proposition 9.5.2, we have a version of
Theorem 9.3.10 for $\mathsf{EM}$.


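Theorem 9.5.7 (Wang; see [175]). $\mathsf{RCA}_0 \vdash \mathsf{EM} \to \mathsf{DNR}(\emptyset')$.

On the first order side, we also get an implication to $\mathsf{B}\Sigma^0_2$.

Theorem 9.5.8 (Kreuzer [187]). $\mathsf{RCA}_0 \vdash \mathsf{EM} \to \mathsf{B}\Sigma^0_2$.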


Proof. We use Hirst’s theorem (Theorem 6.5.1) and prove that $\mathsf{EM} \to \mathsf{RT}^1$. Arguing
in $\mathsf{RCA}_0$, fix $c \colon \mathbb{N} \to k$, $k \geq 1$. Define $d \colon [\mathbb{N}]^2 \to 2$ by letting $d(x, y)$ be 1 or
0 depending as $c(x) = c(y)$ or $c(x) \neq c(y)$, respectively. Apply EM to find an
infinite set $T$ such that $d \upharpoonright [T]^2$ is transitive. For all $x < y < z$ in $T$, if $c(x) \neq c(y)$
and $c(y) \neq c(z)$ then $d(x, y) = d(y, z) = 0$. Hence, we must have $d(x, z) = 0$ by
transitivity, so $c(x) \neq c(z)$. It follows that $c$ colors the elements of $T$ in intervals: if
$c(x) = i < k$ and $c(y) \neq i$ for some $y > x$ then $c(z) \neq i$ for all $z > y$. By bounded
$\Sigma^0_1$ comprehension, we can form the set $F = \{i < k : (\exists x \in T)[c(x) = i]\}$. Let
$W = \{x \in T : c(x) \in F \wedge (\forall y < x)[y \in T \to c(y) \neq c(x)]\}$. Then $W$ exists, and
it must be finite. (If not, fix $x_0 < \cdots < x_k$ in $W$. By definition, $c$ is injective on
$\{x_\ell : \ell \leq k\}$. But then this is an injection of a set of size $k + 1$ into a set of size
$k$, contradicting Proposition 6.2.7.) Let $x$ be the largest element of $W$. Then by our
observation, we have $c(y) = c(x)$ for all $y > x$ in $T$. In particular, $\{y \in T : y > x\}$
is an infinite homogeneous set for $c$. □
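
The combinatorial step in this proof can be checked on finite data. The following is a minimal sketch, not part of the formal argument: the function names and the sample coloring `c` are illustrative assumptions. It builds the pair coloring $d$ from $c$, tests transitivity of $d$ on a finite set by brute force, and compares this with the property that the colors of $c$ occur in intervals along the set.

```python
from itertools import combinations

def induced_pair_coloring(c):
    """Return d with d(x, y) = 1 if c(x) == c(y) and 0 otherwise (for x < y)."""
    return lambda x, y: 1 if c(x) == c(y) else 0

def is_transitive_on(d, T):
    """d is transitive on T if, for all x < y < z in T,
    d(x, y) == d(y, z) forces d(x, z) to take that same value."""
    return all(d(x, z) == d(x, y)
               for x, y, z in combinations(sorted(T), 3)
               if d(x, y) == d(y, z))

def colors_form_intervals(c, T):
    """True if no color of c reappears along T once a different color has followed it."""
    left_behind = set()
    elems = sorted(T)
    for prev, cur in zip(elems, elems[1:]):
        if c(prev) != c(cur):
            left_behind.add(c(prev))
        if c(cur) in left_behind:
            return False
    return True

# Hypothetical sample data: a 3-coloring of the naturals and two finite sets.
c = lambda x: x % 3
d = induced_pair_coloring(c)
T_good = [0, 3, 6, 7, 10]   # colors 0, 0, 0, 1, 1: d is transitive here
T_bad = [0, 1, 3]           # colors 0, 1, 0: d(0,1) = d(1,3) = 0 but d(0,3) = 1
```

On a finite set the two properties coincide, which is the observation the proof uses before passing to the last "first occurrence" of a color.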


9.6 The Chubb–Hirst–McNicholl tree theorem


A different variant of Ramsey’s theorem is obtained by changing the structure on 
which colorings are defined. Especially interesting here are trees. 


Definition 9.6.1. Fix numbers $n, k \geq 1$ and a set $T \subseteq 2^{<\omega}$.


1. $[T]^n = \{(\sigma_0, \ldots, \sigma_{n-1}) \in T^n : \sigma_0 \prec \cdots \prec \sigma_{n-1}\}$.
2. A $k$-coloring of $[T]^n$ is a map $c \colon [T]^n \to k$.
3. A set $S \subseteq T$ is homogeneous for $c \colon [T]^n \to k$ if $c$ is constant on $[S]^n$.

We follow all the same conventions for the above definition as we do for the analogous
definition for colorings of (subsets of) $[\omega]^n$ (Definition 3.2.4).

We will shortly justify why we are restricting to colorings of comparable nodes 
here. First, we define the kinds of sets we want to be homogeneous. 


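Definition 9.6.2. A set $T \subseteq 2^{<\omega}$ is isomorphic to $2^{<\omega}$, written $T \cong 2^{<\omega}$, if $(T, \preceq)$
and $(2^{<\omega}, \preceq)$ are isomorphic structures (i.e., there is a bijection $f \colon T \to 2^{<\omega}$ such
that for all $\sigma, \tau \in T$, $\sigma \preceq \tau$ if and only if $f(\sigma) \preceq f(\tau)$).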


Intuitively, the homogeneous object should “look like” the object used to define the
coloring. Here this is $2^{<\omega}$, but it is exactly the same in the case of Ramsey’s theorem.
There, a homogeneous object is not just any set, but an infinite set, i.e., one which,
together with $<$, is isomorphic to $(\omega, <)$.

We can now state the following tree version of Ramsey’s theorem, originally 
introduced by Chubb, Hirst, and McNicholl [45]. 


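Definition 9.6.3 (Chubb–Hirst–McNicholl tree theorem).


1. For $n, k \geq 1$, $\mathsf{TT}^n_k$ is the following statement: every $c \colon [2^{<\omega}]^n \to k$ has a
homogeneous set $T \cong 2^{<\omega}$.

2. For $n \geq 1$, $\mathsf{TT}^n$ is $(\forall k)\mathsf{TT}^n_k$.

3. $\mathsf{TT}$ is $(\forall n)\mathsf{TT}^n$.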


In the reverse mathematics literature, TT is sometimes called “Ramsey’s theorem
on trees”, but more commonly simply the “tree theorem”. We will refer to it here
either by its abbreviation or as the “Chubb–Hirst–McNicholl tree theorem”, so as
to distinguish it from another principle, Milliken’s tree theorem, which we discuss
below.

The reason $[T]^n$ is defined to consist only of $n$-tuples of comparable nodes of
$T$, rather than arbitrary size-$n$ subsets of $T$, is that the latter would render the tree
theorem false. Consider, e.g., the coloring that assigns a pair of nodes $(\sigma, \tau)$ the
color 0 if $\sigma \mid \tau$ (i.e., the nodes are incomparable), and the color 1 otherwise. This
coloring is not constant on the (comparable and incomparable) pairs of any $T \cong 2^{<\omega}$.
We will return to this issue at the end of the section.

Perhaps the most obvious first question in any investigation of TT is how it
compares to RT. It is easy to see that for all $n, k \geq 1$, $\mathsf{RT}^n_k \leq_{\mathrm{sW}} \mathsf{TT}^n_k$. Indeed, fix
$c \colon [\omega]^n \to k$ and define $d \colon [2^{<\omega}]^n \to k$ as follows: for all $\sigma_0 \prec \cdots \prec \sigma_{n-1}$ in
$2^{<\omega}$, let

$$d(\sigma_0, \ldots, \sigma_{n-1}) = c(|\sigma_0|, \ldots, |\sigma_{n-1}|).$$

Note that $(|\sigma_0|, \ldots, |\sigma_{n-1}|) \in [\omega]^n$ since $|\sigma_0| < \cdots < |\sigma_{n-1}|$. (The coloring $d$ is
called the level coloring induced by $c$.) Now let $T \cong 2^{<\omega}$ be any homogeneous set
for $d$, and let $(\sigma_i : i \in \omega)$ be the sequence of lexicographically least elements of $T$.
Then $H = \{|\sigma_i| : i \in \omega\}$ is an infinite homogeneous set for $c$.
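
The level coloring is simple enough to write down directly. The sketch below is illustrative only: binary strings are represented as Python `str` values over `"0"`/`"1"`, and the names `level_coloring`, `is_proper_prefix`, and the sample coloring `c` are assumptions made for the example.

```python
def is_proper_prefix(s, t):
    """True if the string s is a proper initial segment of t."""
    return len(s) < len(t) and t.startswith(s)

def level_coloring(c):
    """Given c defined on increasing n-tuples of naturals, return the
    coloring d of chains sigma_0 < ... < sigma_{n-1} of binary strings
    given by d(sigma_0, ..., sigma_{n-1}) = c(|sigma_0|, ..., |sigma_{n-1}|)."""
    def d(*nodes):
        if not all(is_proper_prefix(a, b) for a, b in zip(nodes, nodes[1:])):
            raise ValueError("nodes must form a chain in the prefix order")
        return c(*map(len, nodes))
    return d

# Hypothetical 2-coloring of pairs of naturals.
c = lambda x, y: (x + y) % 2
d = level_coloring(c)
```

Since lengths strictly increase along a chain, `d` only ever evaluates `c` on increasing tuples, which is why a set homogeneous for `d` yields a set of lengths homogeneous for `c`.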

The argument can be formalized to also show that $\mathsf{TT}^n_k \to \mathsf{RT}^n_k$ over $\mathsf{RCA}_0$. (The
only point requiring a bit of justification is the existence of the sequence of $\sigma_i$, and this
is Exercise 9.13.13.) Thus $\mathsf{TT}^n_k$ inherits all the lower complexity bounds of Ramsey’s
theorem. Chubb, Hirst, and McNicholl [45] showed that, in terms of the arithmetical
and jump hierarchies, it also has the same upper bounds, and indeed behaves very
similarly. However, because of the tree structure, some of the proofs, like that of
(1) below, are surprisingly more complicated.


Theorem 9.6.4 (Chubb, Hirst, and McNicholl [45]). 


1. For all $n \geq 1$, $\mathsf{TT}^n$ admits $\Pi^0_n$ solutions and $\mathsf{ACA}_0 \vdash \mathsf{TT}^n$.

2. For all $n \geq 2$, $\mathsf{TT}^n_2$ omits $\Sigma^0_n$ solutions and $\mathsf{RCA}_0 \vdash \mathsf{ACA}'_0 \leftrightarrow \mathsf{TT}$.

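3. For all $n \geq 3$, $\mathsf{TT}^n_2$ codes the jump and $\mathsf{RCA}_0 \vdash \mathsf{ACA}_0 \leftrightarrow \mathsf{TT}^n_2 \leftrightarrow \mathsf{TT}^n$.

4. $\mathsf{RCA}_0 \vdash (\forall n, k \geq 1)[\mathsf{TT}^n_k \to \mathsf{RT}^n_k]$.

5. $\mathsf{RCA}_0 \vdash \mathsf{I}\Sigma^0_2 \to \mathsf{TT}^1 \to \mathsf{B}\Sigma^0_2$.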


Historically, the first evidence that the Chubb–Hirst–McNicholl tree theorem
behaves differently from Ramsey’s theorem was the following result showing that 
Hirst’s theorem does not lift to the tree setting. The proof utilizes a very clever model 
theoretic argument to translate a failure of induction into a combinatorial advantage. 


Theorem 9.6.5 (Corduan, Groszek, and Mileti [56]). $\mathsf{RCA}_0 \nvdash \mathsf{B}\Sigma^0_2 \to \mathsf{TT}^1$.


Proof. The main insight is that there exists a computable map $f \colon \omega \times 2^{<\omega} \to 2$
with the following property: for each $n \in \omega$ and $e < n$, if $\Phi_e$ is a computable set
$T \cong 2^{<\omega}$ then there exist incomparable nodes $\sigma_0, \sigma_1 \in T$ such that for each $i < 2$,
$f(n, \tau) = i$ for all $\tau \in T$ extending $\sigma_i$. This is proved by a finite injury argument
with a computably bounded number of injuries (see Exercise 9.13.15). As discussed
in Section 6.4, this can be formalized in $\mathsf{RCA}_0$.

We will use this to construct a model of $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2 + \neg\mathsf{TT}^1$. Start with a model
$M$ of $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2 + \neg\mathsf{I}\Sigma^0_2$. Fix a $b \in M$ and a function $g \colon (M \upharpoonright b) \times M \to M$ as
given by Exercise 6.7.5. So there is a proper cut $I \subseteq M$ such that $M \models \lim_s g(a, s)$
exists for all $a \in I$, and the values of these limits are unbounded in $M$. Also, $g$ is
$\Delta^0_1$-definable in $M$. Let $A$ be the join of the set parameters in this definition, so that
$g$ is actually $\Delta^0_1(A)$-definable (with no other set parameters).

Let $f$ be as in the observation above, relativized to $A$. Notably, $f$ is also $\Delta^0_1(A)$-
definable in $M$. We define a $\Delta^0_1(A)$-definable instance $c$ of $\mathsf{TT}^1$ as follows. Given
$\sigma \in 2^{<M}$, let $c(\sigma)$ be (the code of) the sequence

$$\langle f(g(a, |\sigma|), \sigma) : a <^M b \rangle.$$

Thus, $c$ is a $2^b$-coloring.

Now, consider the structure $M^*$ obtained from $M$ by restricting the second order
part to those $S \in \mathcal{S}^M$ that are $\Delta^0_1(A)$ in $M$ with no other parameters. This is an
$\omega$-submodel of $M$, so by Theorem 5.9.3, every $\Pi^1_1$ sentence true in $M$ is also true in
$M^*$. In particular, $M^* \models \mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2$. Also, the defining properties of both $g$ and $f$
are $\Pi^1_1$, and so hold in $M^*$. We claim that $c$ has no solution in $\mathcal{S}^{M^*}$. Since $c \in \mathcal{S}^{M^*}$,
it follows that $M^* \not\models \mathsf{TT}^1$, which completes the proof.

To prove the claim, consider any $T$ in $\mathcal{S}^{M^*}$. By definition, $T$ is $A$-computable (in
the sense of $M^*$); hence, there is an $e \in M$ such that $T = \Phi^A_e$ in $M^*$. Since the values
of $\lim_s g(a, s)$ are unbounded on $I$, we can choose $a \in I$ with $\lim_s g(a, s) = n >^M e$.
If $T \cong 2^{<M}$, choose $\sigma_0, \sigma_1 \in T$ as in the definition of $f$ for this $n$ and $e$. Also, fix $s_0$
so that $g(a, s) = n$ for all $s \geq^M s_0$, and for each $i < 2$, fix $\tau_i \succeq \sigma_i$ in $T$ with $|\tau_i| \geq s_0$.
Then for each $i < 2$ we have $f(g(a, |\tau_i|), \tau_i) = f(n, \tau_i) = i$, so $f(g(a, |\tau_0|), \tau_0) \neq
f(g(a, |\tau_1|), \tau_1)$. But then $c(\tau_0) \neq c(\tau_1)$, so $T$ is not homogeneous for $c$. Since $T$
was an arbitrary element of $\mathcal{S}^{M^*}$, this proves our claim. □


The reader may wish to ponder why, exactly, the above argument cannot be replicated
for $\mathsf{RT}^1$ in place of $\mathsf{TT}^1$. (Of course, we know $\mathsf{B}\Sigma^0_2$ does imply $\mathsf{RT}^1$.) Can the argument
be adapted to work for $\mathsf{RT}^2_2$?

Using different model theoretic techniques, it is possible to obtain further results
concerning the strength of $\mathsf{TT}^1$. The following theorem complements the previous
one to establish that both implications in part (5) of Theorem 9.6.4 are strict.


Theorem 9.6.6 (Chong, Li, Wang, and Yang [35]). $\mathsf{RCA}_0 \nvdash \mathsf{TT}^1 \to \mathsf{I}\Sigma^0_2$.

And the following provides an extension of Theorem 8.7.20.


Theorem 9.6.7 (Chong, Wang, and Yang [43]). $\mathsf{WKL} + \mathsf{RT}^2_2 + \mathsf{TT}^1$ is $\Pi^0_3$-conservative
over $\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2$.


On the second order side, parts (1) and (4) of Theorem 9.6.4 immediately raise
the tantalizing possibility that $\mathsf{TT}^2_2$ might lie strictly between $\mathsf{ACA}_0$ and $\mathsf{RT}^2_2$. This is
compelling, since $\mathsf{RT}^2_2$ is the strongest of the principles we have seen so far that lie
strictly below arithmetical comprehension. The first step in “dethroning” $\mathsf{RT}^2_2$ from
this position was the following separation.


Theorem 9.6.8 (Patey [242]). $\mathsf{TT}^2_2 \nleq_\omega \mathsf{RT}^2_2$. Hence, also $\mathsf{RCA}_0 \nvdash \mathsf{RT}^2_2 \to \mathsf{TT}^2_2$.


The second step was to establish a version of Seetapun’s theorem for trees. This was 
obtained somewhat later. 


Theorem 9.6.9 (Dzhafarov and Patey [89]). $\mathsf{TT}^2_2$ admits cone avoidance. Hence,
also $\mathsf{RCA}_0 \nvdash \mathsf{TT}^2_2 \to \mathsf{ACA}_0$.


And later still, it was even shown that a version of Liu’s theorem (Theorem 8.6.1) 
for trees holds as well. 


Theorem 9.6.10 (Chong, Li, Liu, and Yang [42]). $\mathsf{TT}^2_2$ admits PA avoidance.
Hence, also $\mathsf{RCA}_0 \nvdash \mathsf{TT}^2_2 \to \mathsf{WKL}$.


Theorem 9.6.8 is obtained using a carefully designed preservation property based
on the fact that, in any $T \cong 2^{<\omega}$, it is possible to do “different things” above
each of a given pair of incomparable nodes. (Essentially, this is also the basis for
the combinatorial ingredient in the proof of Theorem 9.6.5 above.) The proofs of
Theorems 9.6.9 and 9.6.10 both rely on an analogue of the Cholak–Jockusch–Slaman
decomposition for trees, and in particular, on suitably formulated stable and cohesive
versions of $\mathsf{TT}^2_2$. (Dzhafarov, Hirst, and Lakins [84] showed that there are actually
several possible candidates for what a stable form of $\mathsf{TT}^2_2$ should look like, and each
of these produces its own cohesive version. See Exercise 9.13.14.) In the case of
Theorem 9.6.9, strong cone avoidance for $\mathsf{TT}^1$ is established, and then cone avoidance
for a cohesive form of $\mathsf{TT}^2_2$. The combinatorial core of the argument features a use of
so-called forcing with bushy trees, originally developed by Kumabe [189, 190] and
used widely in the study of the Turing degrees and in algorithmic randomness. (For a
nice survey of this method, see Khan and Miller [178].) The proof of Theorem 9.6.10
relies on more intricate preservation properties of the stable and cohesive forms of
$\mathsf{TT}^2_2$, and in particular, gives a new and somewhat more modular proof of Liu’s
theorem.


9.7 Milliken’s tree theorem 


There is another kind of “Ramsey’s theorem on trees”, of which TT is actually 
just a special case. This is Milliken’s tree theorem, originally proved by Milliken 
in [214], and later refined in [215]. A slightly more modern proof appears in the 
book of Todorčević [310]. While the Chubb–Hirst–McNicholl tree theorem (TT) of
the preceding section was invented directly in the reverse mathematics literature,
Milliken’s tree theorem has been studied extensively in descriptive set theory and
combinatorics. At first glance, it may perhaps seem like an oddly technical result,
but in fact it is incredibly elegant. Importantly, it is also a natural generalization of
a number of partition results, including Ramsey’s theorem and many others. Such
generalizations are a main object of study in structural Ramsey theory (see Nešetřil
and Rödl [231]).
To state Milliken’s tree theorem, we begin with preliminary definitions. 


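Definition 9.7.1. Fix $T \subseteq \omega^{<\omega}$.


1. $T$ is rooted if there is an $\alpha \in T$ such that $\alpha \preceq \beta$ for all $\beta \in T$.

2. $T$ is meet-closed if for all $\alpha, \beta \in T$, the longest common initial segment of $\alpha$ and
$\beta$ belongs to $T$.

3. For $\alpha \in T$, the level of $\alpha$ in $T$, denoted $\mathrm{lvl}_T(\alpha)$, is $|\{\beta \in T : \beta \prec \alpha\}|$.

4. For each $n \in \omega$, $T(n) = \{\alpha \in T : \mathrm{lvl}_T(\alpha) = n\}$.

5. The height of $T$, denoted $\mathrm{ht}(T)$, is $\sup\{n > 0 : T(n - 1) \neq \emptyset\}$.

6. For $k \in \omega$, $\alpha \in T$ is $k$-branching in $T$ if there exist exactly $k$ many distinct $\beta \succ \alpha$
in $T$ with $\mathrm{lvl}_T(\beta) = \mathrm{lvl}_T(\alpha) + 1$.

7. $\alpha \in T$ is a leaf of $T$ if it is 0-branching in $T$.

8. $T$ is finitely branching if every $\alpha \in T$ is $k$-branching for some $k \in \omega$.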

So, structurally, rooted meet-closed sets “look like” finitely-branching trees, even 
though they do not have to be trees at all in our usual sense. (In fact, in the literature 
these sets are often referred to as “trees” for simplicity, but we will not do so here to 
avoid any possibility of confusion.) 

Figure 9.3. An illustration of several subsets, $S_0$, $S_1$, and $S_2$ of T = 2+. $T$ is represented by
thin gray lines. The elements of each $S_i$ are the solid nodes connected by heavy black lines. $S_0$ is
not a strong subtree of $T$ because it is not meet-closed. $S_1$ is not a strong subtree because it fails
condition (1) in Definition 9.7.2. Only $S_2$ is a strong subtree of $T$; $S_2 \in \mathcal{S}_2(T)$. Each of the $S_i$ is
isomorphic to $2^{<2}$ as structures under $\preceq$.


Definition 9.7.2. Let $T \subseteq \omega^{<\omega}$ be rooted, meet-closed, and finitely branching. A set
$S \subseteq T$ is a strong subtree of $T$ if it is rooted, meet-closed, and the following are true.


1. There exists $f \colon \mathrm{ht}(S) \to \mathrm{ht}(T)$ so that $\mathrm{lvl}_T(\alpha) = f(\mathrm{lvl}_S(\alpha))$ for all $\alpha \in S$.
2. For all $\alpha \in S$ and $k \in \omega$, if $\alpha$ is $k$-branching in $T$ and $\mathrm{lvl}_S(\alpha) + 1 < \mathrm{ht}(S)$ then
$\alpha$ is $k$-branching in $S$.


For $\eta \in \omega \cup \{\omega\}$, $\mathcal{S}_\eta(T)$ is the set of all strong subtrees of $T$ of height $\eta$.


The defining property of the function $f$ above can be restated as follows: nodes at
the same level in $S$ must come from the same level in $T$. See Figure 9.3. Notice that
if $T \in \mathcal{S}_\omega(2^{<\omega})$ then in particular $T \cong 2^{<\omega}$.
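
For finite sets of nodes these definitions can be tested mechanically. The sketch below is an illustration under stated assumptions, not a definitive implementation: nodes are Python tuples, sets of nodes are Python `set` objects, all function names are made up for the example, and condition (2) of Definition 9.7.2 is read as applying to nodes of $S$ that are not on the top level of $S$. The three sample sets are hypothetical and only mimic the kinds of failure described for Figure 9.3.

```python
from itertools import product

def is_proper_prefix(a, b):
    return len(a) < len(b) and b[:len(a)] == a

def meet(a, b):
    """Longest common initial segment of a and b."""
    i = 0
    while i < min(len(a), len(b)) and a[i] == b[i]:
        i += 1
    return a[:i]

def level(T, a):
    """lvl_T(a): the number of proper initial segments of a lying in T."""
    return sum(1 for b in T if is_proper_prefix(b, a))

def height(T):
    return 1 + max(level(T, a) for a in T) if T else 0

def branching(T, a):
    """Number of extensions of a in T exactly one T-level above a."""
    la = level(T, a)
    return sum(1 for b in T if is_proper_prefix(a, b) and level(T, b) == la + 1)

def is_rooted(T):
    return any(all(a == b or is_proper_prefix(a, b) for b in T) for a in T)

def is_meet_closed(T):
    return all(meet(a, b) in T for a in T for b in T)

def is_strong_subtree(S, T):
    if not (S <= T and is_rooted(S) and is_meet_closed(S)):
        return False
    # Condition (1): nodes on the same level of S lie on the same level of T.
    level_map = {}
    for a in S:
        if level_map.setdefault(level(S, a), level(T, a)) != level(T, a):
            return False
    # Condition (2): below the top level of S, branching in S matches branching in T.
    h = height(S)
    return all(branching(S, a) == branching(T, a)
               for a in S if level(S, a) + 1 < h)

# All binary strings of length < 4, and three hypothetical 3-element subsets.
T = {tuple(p) for n in range(4) for p in product((0, 1), repeat=n)}
S0 = {(), (0, 0), (0, 1)}   # not meet-closed: the meet (0,) is missing
S1 = {(), (0,), (1, 0)}     # same S-level, different T-levels: fails (1)
S2 = {(), (0, 0), (1, 0)}   # a strong subtree of height 2
```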


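Definition 9.7.3 (Milliken’s tree theorem).


1. For $n, k \geq 1$, $\mathsf{MTT}^n_k$ is the following statement: if $T \subseteq \omega^{<\omega}$ is rooted, meet-closed,
and finitely branching, with height $\omega$ and no leaves, then for every
$c \colon \mathcal{S}_n(T) \to k$ there exists $S \in \mathcal{S}_\omega(T)$ such that $c$ is constant on $\mathcal{S}_n(S)$.

2. For $n \geq 1$, $\mathsf{MTT}^n$ is $(\forall k)\mathsf{MTT}^n_k$.

3. $\mathsf{MTT}$ is $(\forall n)\mathsf{MTT}^n$.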


It is not difficult to see that $\mathsf{MTT}^n_k$ implies $\mathsf{TT}^n_k$ for all $n$ and $k$ (see Exercise 9.13.17).
Hence, in particular, $\mathsf{MTT}^n_k$ implies $\mathsf{RT}^n_k$ and most of the principles below $\mathsf{ACA}_0$ that
we have seen so far. This reflects the aforementioned fact that, combinatorially, it is
a common generalization of these and many other partition results.

The problem of finding the complexity of MTT from the point of view of com- 
putability theory and reverse mathematics was first posed by Dobrinen [68] and 
by Dobrinen, Laflamme, and Sauer [69]. One motivation for this has to do with 
how Milliken’s tree theorem is proved. Namely, every proof of MTT actually proves a 
seemingly stronger product form of Milliken’s tree theorem (which we define below). 
It is natural to wonder if this is necessarily so, and of course reverse mathematics 
lends itself well to this analysis. Indeed, we will see that the answer is yes. First, we 
need some additional definitions. 

Definition 9.7.4. Fix $d \geq 1$ and let $T_0, \ldots, T_{d-1} \subseteq \omega^{<\omega}$ be rooted, meet-closed, and
finitely branching. For $\eta \in \omega \cup \{\omega\}$, $\mathcal{S}_\eta(T_0, \ldots, T_{d-1})$ is the set of all $(S_0, \ldots, S_{d-1})$
such that $S_i \in \mathcal{S}_\eta(T_i)$ for all $i < d$, and the function $f$ witnessing this (from
Definition 9.7.2 (1)) is the same for all $i$.


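Definition 9.7.5 (Product form of Milliken’s tree theorem).


1. For $n, k \geq 1$, $\mathsf{PMTT}^n_k$ is the following statement: fix $d \geq 1$ and let $T_0, \ldots, T_{d-1} \subseteq
\omega^{<\omega}$ be rooted, meet-closed, and finitely branching, with height $\omega$ and no
leaves. Then for every $c \colon \mathcal{S}_n(T_0, \ldots, T_{d-1}) \to k$ there exists $(S_0, \ldots, S_{d-1}) \in
\mathcal{S}_\omega(T_0, \ldots, T_{d-1})$ such that $c$ is constant on the set $\mathcal{S}_n(S_0, \ldots, S_{d-1})$.
2. For $n \geq 1$, $\mathsf{PMTT}^n$ is $(\forall k)\mathsf{PMTT}^n_k$.
3. $\mathsf{PMTT}$ is $(\forall n)\mathsf{PMTT}^n$.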


Thus, Milliken’s tree theorem is just the product form of Milliken’s tree theorem
with $d = 1$. As pointed out in [214], the principle $\mathsf{PMTT}^1$ was first proved by
Laver (unpublished), and then Pincus [248] showed it to be a consequence of an
earlier combinatorial theorem due to Halpern and Läuchli [135]. As a result, $\mathsf{PMTT}^1$
is commonly referred to either as the Halpern–Läuchli theorem or the
Halpern–Läuchli–Laver–Pincus (HLLP) theorem.

It may look like PMTT is just the parallelization of MTT (in the sense of Definition
3.1.5), but the requirement in the definition of $\mathcal{S}_\eta(T_0, \ldots, T_{d-1})$ of a common
level function significantly complicates the combinatorics. Indeed, already proving
$\mathsf{PMTT}^1$ seems to encompass most of the combinatorial complexity of (the full)
Milliken’s tree theorem. This is quite a contrast from, say, $\mathsf{RT}^1$ and (the full) Ramsey’s
theorem. However, it turns out that in terms of computational complexity the
situations are the same. The first part of the following result was independently noted by
Simpson [285, Theorem 15].


Theorem 9.7.6 (Anglés d’Auriac, Cholak, Dzhafarov, Monin, and Patey [61]).
$\mathsf{PMTT}^1$ admits computable solutions. In fact, for every $A \in 2^\omega$, every $A$-computable
instance of $\mathsf{PMTT}^1$ has an $A$-computable solution whose index can be found
uniformly arithmetically in $A$.


Much like in standard proofs of Ramsey’s theorem (e.g., Lemma 8.1.2), PMTT
can be proved by induction. The inductive basis is $\mathsf{PMTT}^1$, and the inductive step,
that $\mathsf{PMTT}^{n+1}$ follows from $\mathsf{PMTT}^n$, is obtained by applying $\mathsf{PMTT}^1$ again. This
immediately yields part (1) of the following corollary. Combining this with earlier
results then yields parts (2) and (3).


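Corollary 9.7.7 (Anglés d’Auriac, Cholak, Dzhafarov, Monin, and Patey [61]).


1. For all $n, k \geq 1$, $\mathsf{ACA}_0 \vdash \mathsf{PMTT}^n_k$.
2. For all $n \geq 3$, $\mathsf{RCA}_0 \vdash \mathsf{ACA}_0 \leftrightarrow \mathsf{PMTT}^n \leftrightarrow \mathsf{MTT}^n \leftrightarrow \mathsf{TT}^n \leftrightarrow \mathsf{RT}^n$.
3. $\mathsf{RCA}_0 \vdash \mathsf{ACA}'_0 \leftrightarrow \mathsf{PMTT} \leftrightarrow \mathsf{MTT} \leftrightarrow \mathsf{TT} \leftrightarrow \mathsf{RT}$. In fact, $\mathsf{PMTT} \equiv_{\mathrm{W}} \mathsf{MTT}$.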


Part (3) helps explain why a proof of MTT that is not also a proof of PMTT is difficult 
to find. By the metrics of reverse mathematics, the two principles have the same 
combinatorial cores. 

Once again, the n = 2 situation turns out to be different. 

Theorem 9.7.8 (Anglés d’Auriac, Cholak, Dzhafarov, Monin, and Patey [61]).
$\mathsf{PMTT}^2$ admits cone avoidance. Hence, also $\mathsf{RCA}_0 \nvdash \mathsf{PMTT}^2 \to \mathsf{ACA}_0$.


The proof follows the by-now familiar pattern. First, we formulate a stable version of
$\mathsf{PMTT}^2$ and show that every $A$-computable instance of the stable version yields an
$A'$-computable instance of $\mathsf{PMTT}^1$ with the same solutions (see Exercise 9.13.18). This is
exactly analogous to the relationship between $\mathsf{D}^2$ and $\mathsf{RT}^1$ given by Propositions 8.4.4
and 8.4.5. Second, a cohesive version is formulated, which is simply the following:
for all $k \geq 1$ and all instances $c \colon \mathcal{S}_2(T_0, \ldots, T_{d-1}) \to k$ of $\mathsf{PMTT}^2_k$, there exists
$(S_0, \ldots, S_{d-1}) \in \mathcal{S}_\omega(T_0, \ldots, T_{d-1})$ such that $c \upharpoonright \mathcal{S}_2(S_0, \ldots, S_{d-1})$ is stable. The
main work then falls to showing that this cohesive version admits cone avoidance,
and that $\mathsf{PMTT}^1$ admits strong cone avoidance.

A number of questions remain open around $\mathsf{PMTT}^2$ and $\mathsf{MTT}^2$. In particular, the
root of Dobrinen’s question about the strength of Milliken’s tree theorem remains
open for colorings of strong subtrees of height 2.


Question 9.7.9. Is it the case that $\mathsf{RCA}_0 \vdash \mathsf{MTT}^2 \to \mathsf{PMTT}^2$? How do $\mathsf{PMTT}^2$ and
$\mathsf{MTT}^2$ compare under reducibilities finer than $\leq_\omega$?


Likewise, the precise relationship between $\mathsf{MTT}^2$ and $\mathsf{TT}^2$ is unknown.

Question 9.7.10. Is it the case that $\mathsf{RCA}_0 \vdash \mathsf{TT}^2 \to \mathsf{MTT}^2$?


Finally, while we have an analogue of Seetapun’s theorem for Milliken’s tree theorem, 
it is open whether the analogue of Liu’s theorem holds as well. 


Question 9.7.11. Over $\mathsf{RCA}_0$, does $\mathsf{MTT}^2$ (or even $\mathsf{PMTT}^2$) imply WKL?


9.8 Thin set and free set theorems 


In this section, we consider the thin set theorem and free set theorem, both of which 
are interesting consequences of Ramsey’s theorem. These were first described in the 
context of reverse mathematics by Friedman [110] (see also Simpson [287]) and 
later expounded in an open questions survey by Friedman and Simpson [116]. The 
first deep dive into these questions came a bit later, by Cholak, Giusto, Hirst, and 
Jockusch [32]. 


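Definition 9.8.1. Fix $n \geq 1$, $X, Y \in 2^\omega$, and a coloring $c \colon [X]^n \to Y$.

1. A set $Z \subseteq X$ is thin for $c$ if $c \upharpoonright [Z]^n$ is not surjective.

2. A set $Z \subseteq X$ is free for $c$ if for all $\vec{x} \in [Z]^n$, if $c(\vec{x}) \in Z$ then $c(\vec{x}) \in \vec{x}$.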
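
Both notions are easy to test on finite data. The following sketch is purely illustrative: the function names, the finite color set `Y`, and the sample colorings are assumptions introduced here, and a coloring is represented as a Python function on increasing tuples.

```python
from itertools import combinations

def is_thin(c, Z, Y, n):
    """Z is thin for c if c restricted to [Z]^n misses at least one element of Y."""
    used = {c(x) for x in combinations(sorted(Z), n)}
    return any(y not in used for y in Y)

def is_free(c, Z, n):
    """Z is free for c if, for every x in [Z]^n, c(x) in Z implies c(x) is an entry of x."""
    members = set(Z)
    return all(c(x) not in members or c(x) in x
               for x in combinations(sorted(Z), n))

# Hypothetical colorings of pairs.
add = lambda x: x[0] + x[1]              # values in the naturals
add_mod3 = lambda x: (x[0] + x[1]) % 3   # values in {0, 1, 2}
```

For instance, $\{1, 2, 4, 8\}$ is free for `add` because no sum of two of its elements lies in the set, whereas $\{1, 2, 3\}$ is not, since $1 + 2 = 3$. Similarly, $\{0, 3, 6, 9\}$ is thin for `add_mod3` with `Y = {0, 1, 2}` because only the color 0 occurs.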


Definition 9.8.2 (Thin set theorem, free set theorem). Fix $n \geq 1$.


1. $\mathsf{TS}^n_\omega$ is the following statement: every $c \colon [\omega]^n \to \omega$ has an infinite thin set.


2. $\mathsf{FS}^n_\omega$ is the following statement: every $c \colon [\omega]^n \to \omega$ has an infinite free set.

We note that some texts, e.g., [32] and [147], use the alternate abbreviations $\mathsf{TS}^n$,
$\mathsf{FS}^n$, $\mathsf{TS}(n)$, and $\mathsf{FS}(n)$. As we will see, the above notation can be more easily
generalized in a manner that is consistent with other principles. In the case of the
free set theorem, it also makes it easier to distinguish from the notation $\mathrm{FS}(X)$ that will
be defined in the next section.

One example of a “free set” in mathematics is a set $X$ of linearly independent
vectors in a vector space. This has the property that if $x_0, \ldots, x_{n-1} \in X$ and
$c(x_0, \ldots, x_{n-1})$ is a linear combination of these vectors that is also in $X$, then
necessarily $c(x_0, \ldots, x_{n-1}) \in \{x_0, \ldots, x_{n-1}\}$, by linear independence. FS can thus be
seen as some kind of combinatorial distillation of this and similar notions of
“independence”. (For some other examples, see [32]. A related discussion also appears in
Knight [182].) In contrast, TS may appear to be somewhat less motivated. The two
nonetheless have much in common.


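Theorem 9.8.3 (Friedman [110]; also Cholak, Giusto, Hirst, and Jockusch [32]).
Fix $n \geq 1$.


1. $\mathsf{RCA}_0 \vdash \mathsf{RT}^n \to \mathsf{FS}^n_\omega \to \mathsf{TS}^n_\omega$.
2. $\mathsf{RCA}_0 \vdash \mathsf{FS}^{n+1}_\omega \to \mathsf{FS}^n_\omega$.
3. $\mathsf{RCA}_0 \vdash \mathsf{TS}^{n+1}_\omega \to \mathsf{TS}^n_\omega$.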


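Proof. We prove the implication $\mathsf{RT}^n \to \mathsf{FS}^n_\omega$ in part (1) and leave the other
implications for the exercises. We argue in $\mathsf{RCA}_0 + \mathsf{RT}^n$. Fix $c \colon [\mathbb{N}]^n \to \mathbb{N}$, an instance of
$\mathsf{FS}^n_\omega$.

Given $\vec{x} \in [\mathbb{N}]^n$, let $k(\vec{x})$ denote the least $k < n$, if it exists, such that $c(\vec{x}) < \vec{x}(k)$.
Now by recursion, define $\vec{x}^0 = \vec{x}$, and having defined $\vec{x}^m \in [\mathbb{N}]^n$ for some $m$, if $k(\vec{x}^m)$
is defined then define $\vec{x}^{m+1} \in [\mathbb{N}]^n$ with

$$\vec{x}^{m+1}(k) = \begin{cases} \vec{x}^m(k) & \text{if } k \neq k(\vec{x}^m), \\ c(\vec{x}^m) & \text{if } k = k(\vec{x}^m) \end{cases}$$

for all $k < n$. Hence, $\vec{x}^{m+1}(k(\vec{x}^m)) < \vec{x}^m(k(\vec{x}^m))$. Let $m(\vec{x})$ be the least $m$ such that
$k(\vec{x}^m)$ is not defined or $k(\vec{x}^m) \neq k(\vec{x})$. Note that the set of $m$ for which $k(\vec{x}^m)$ is
defined and equal to $k(\vec{x})$ is $\Delta^0_1$-definable and therefore exists. If this set is empty,
then $m(\vec{x}) = 0$. Otherwise, $m(\vec{x})$ exists by the least number principle.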

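We now define an instance $d$ of $\mathsf{RT}^n_{2n+2}$, which is a consequence of $\mathsf{RT}^n$, as follows:
given $\vec{x} \in [\mathbb{N}]^n$, let

$$d(\vec{x}) = \begin{cases} 0 & \text{if } c(\vec{x}) \in \vec{x}, \\ 1 & \text{if } c(\vec{x}) > \vec{x}, \\ 2k + 2 + b & \text{if } c(\vec{x}) \notin \vec{x} \wedge c(\vec{x}) \ngtr \vec{x} \wedge k = k(\vec{x}) \wedge m(\vec{x}) \equiv b \pmod 2, \end{cases}$$

where $b < 2$. Note that $d$ is well-defined: if $c(\vec{x}) \notin \vec{x}$ and $c(\vec{x}) \ngtr \vec{x}$ then $k(\vec{x})$ is defined and so is
$m(\vec{x})$. Let $H$ be any infinite homogeneous set for $d$, say with color $i < 2n + 2$. We
break into cases according to the value of $i$.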

Case 1: $i = 0$. In this case, $H$ is a free set for $c$.


Case 2: $i = 1$. We define an infinite set $X = \{x_0 < x_1 < \cdots\} \subseteq H$ recursively as
follows. Let $x_0 = \min H$. Now fix $s > 0$ and assume we have defined $x_t$ for all $t < s$.
Let $x_s$ be the least element of $H$ larger than $x_{s-1}$ and $c(\vec{x})$ for every $\vec{x} \in [\{x_t : t < s\}]^n$. (Since
the latter set is finite, the range of $c$ restricted to it is finite by Proposition 6.2.4,
and so $x_s$ exists because $H$ is infinite.) This completes the construction. Now fix any
$\vec{x} \in [X]^n$. We claim that $c(\vec{x}) \notin X$, whence it follows that $X$ is free for $c$. Suppose
not. Say $c(\vec{x}) = x_s$. Since $X$ is a subset of $H$, it is still homogeneous for $d$ with
color 1, and so $c(\vec{x}) > \vec{x}$. It follows that $\vec{x} \in [\{x_t : t < s\}]^n$. But by construction,
$x_s$ is larger than $c(\vec{y})$ for all $\vec{y} \in [\{x_t : t < s\}]^n$. In particular, $x_s > c(\vec{x}) = x_s$, a
contradiction.


Case 3: $i > 1$. We claim that $H$ is free for $c$. To see this, consider any $\vec{x} \in [H]^n$ and
suppose $c(\vec{x}) \in H$. Since we are in Case 3, $k(\vec{x})$ and $m(\vec{x})$ are defined. And since $\vec{x} \in
[H]^n$ and $c(\vec{x}) \in H$, we must also have $\vec{x}^1 \in [H]^n$. By homogeneity of $H$, it follows
that $d(\vec{x}) = d(\vec{x}^1)$, which means that $k(\vec{x}) = k(\vec{x}^1)$ and $m(\vec{x}) \equiv m(\vec{x}^1) \pmod 2$. But
$(\vec{x}^1)^m = \vec{x}^{m+1}$ for all $m$, so clearly $m(\vec{x}) = m(\vec{x}^1) + 1$. Hence, no $\vec{x} \in [H]^n$ can have
$c(\vec{x}) \in H$. □
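
The auxiliary coloring $d$ can be computed explicitly for a concrete $c$. The sketch below is a finite illustration under stated assumptions, not the formal construction: tuples are Python tuples of naturals, $k(\vec{x})$ is treated as undefined when $c(\vec{x})$ is itself an entry of $\vec{x}$ (so that every iterate stays strictly increasing), the third clause uses the offset $2k + 2 + b$ so that the $2n + 2$ colors are distinct, and all function names and the sample coloring `c` are hypothetical.

```python
from itertools import combinations

def k_index(c, x):
    """Least k with c(x) < x[k], or None if there is none.
    Assumption: also None when c(x) already occurs in x."""
    v = c(x)
    if v in x:
        return None
    return next((k for k, xk in enumerate(x) if v < xk), None)

def step(c, x):
    """One step of the recursion: replace entry k(x) of x by c(x).
    Only call this when k_index(c, x) is defined."""
    k = k_index(c, x)
    return x[:k] + (c(x),) + x[k + 1:]

def m_value(c, x):
    """Least m such that k(x^m) is undefined or differs from k(x)."""
    k0, m = k_index(c, x), 0
    while k0 is not None and k_index(c, x) == k0:
        x, m = step(c, x), m + 1   # entry k0 strictly decreases, so this halts
    return m

def d_coloring(c, x):
    """The (2n+2)-coloring used to reduce the free set theorem to Ramsey's theorem."""
    v = c(x)
    if v in x:
        return 0
    if v > x[-1]:
        return 1
    return 2 * k_index(c, x) + 2 + m_value(c, x) % 2

# Hypothetical coloring of pairs with values in {0, ..., 6}.
c = lambda x: (3 * x[0] + x[1]) % 7
```

The parity argument of Case 3 shows up as follows: whenever `d_coloring(c, x)` is at least 2, the tuple `step(c, x)` receives a different color, so a set homogeneous for a color above 1 cannot contain both.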


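Corollary 9.8.4.


1. $\mathsf{RCA}_0 \vdash \mathsf{FS}^1_\omega$, and for all $n \geq 2$, $\mathsf{ACA}_0 \vdash \mathsf{FS}^n_\omega$.
2. $\mathsf{ACA}'_0 \vdash \mathsf{FS}_\omega$.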


Note that the proof above actually shows that $\mathsf{FS}^n_\omega \leq_{\mathrm{c}} \mathsf{RT}^n_{2n+2}$. The following question
is open:


Question 9.8.5. Is it the case that $\mathsf{FS}^n_\omega \leq_{\mathrm{c}} \mathsf{RT}^n_j$ for some $j < 2n + 2$?


Friedman [110] showed separately that FS’, <u sakes Since RT3 — ACA > 
Al this yields an alternative proof of the above theorem for n > 3. 

A minor modification to the proof of Theorem 9.8.3 can be used to show that the
solution produced there can be chosen to be $\Pi^0_n$. By a separate argument similar to
the proof of Theorem 8.2.2, it is not difficult to build a computable instance of $\mathsf{TS}^n_\omega$
with no $\Delta^0_n$ solution. Hence, the free set theorem and consequently also the thin set
theorem enjoy the same arithmetical bounds familiar to us from Ramsey’s theorem
and, by now, many other principles.

Theorem 9.8.6 (Cholak, Giusto, Hirst, and Jockusch [32]).


1. For every $n \geq 1$, $\mathsf{FS}^n_\omega$ admits $\Pi^0_n$ solutions.
2. For every $n \geq 2$, $\mathsf{TS}^n_\omega$ omits $\Sigma^0_n$ solutions.


Surprisingly, the free set theorem is much more closely related to the rainbow
Ramsey’s theorem than to Ramsey’s theorem itself. The first result showing this was
obtained by Wang, and it has an immediate and important corollary.


Theorem 9.8.7 (Wang [321]). For all $n \geq 1$, $\mathsf{RRT}^n_2$ is uniformly identity reducible
to $\mathsf{FS}^n_\omega$ and $\mathsf{RCA}_0 \vdash \mathsf{FS}^n_\omega \to \mathsf{RRT}^n_2$.

Corollary 9.8.8. $(\forall n)\mathsf{FS}^n_\omega$ admits strong cone avoidance. Hence, $\mathsf{TJ} \nleq_\omega \mathsf{FS}^n_\omega$ and
$\mathsf{RCA}_0 \nvdash \mathsf{FS}^n_\omega \to \mathsf{ACA}_0$.


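Proof. Immediate by Theorem 9.4.4. □

Later, Patey obtained a partial reversal to Wang’s theorem.

Theorem 9.8.9 (Patey [240]). For all $n \geq 1$, $\mathsf{RCA}_0 \vdash \mathsf{RRT}^{2n+1}_2 \to \mathsf{FS}^n_\omega$.

Corollary 9.8.10. $(\forall n)\mathsf{RRT}^n_2 \equiv_\omega (\forall n)\mathsf{FS}^n_\omega$.

It remains open whether this equivalence extends to arbitrary models.

Question 9.8.11. Is it the case that $\mathsf{RCA}_0 \vdash (\forall n)\mathsf{RRT}^n_2 \leftrightarrow (\forall n)\mathsf{FS}^n_\omega$?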


It also remains open how close the free set theorem and thin set theorem are to
one another. The combinatorics of the two theorems are very similar, which has thus
far frustrated all attempts to separate them.


Question 9.8.12. Is it the case that $\mathsf{RCA}_0 \vdash (\forall n)[\mathsf{TS}^n_\omega \to \mathsf{FS}^n_\omega]$? For each $n$, does
there exist an $m > n$ such that $\mathsf{RCA}_0 \vdash \mathsf{TS}^m_\omega \to \mathsf{FS}^n_\omega$?


In light of Theorem 9.8.9, the following result may be viewed as a partial step towards 
a positive answer to the second question. 


Theorem 9.8.13 (Patey [240]). $\mathsf{RCA}_0 \vdash \mathsf{TS}^2_\omega \to \mathsf{RRT}^2_2$.


This is currently the only nontrivial lower bound on the strength of $\mathsf{TS}^2_\omega$ in terms
of implications over $\mathsf{RCA}_0$, extending a prior result by Rice [257] that $\mathsf{TS}^2_\omega$ implies
DNR.

There is a natural generalization of $\mathsf{TS}^n_\omega$ that was originally considered by Dorais,
Dzhafarov, Hirst, Mileti, and Shafer [72].


Definition 9.8.14 (Thin set theorem for finite colorings). For $n \geq 1$ and $k \geq 2$,
$\mathsf{TS}^n_k$ is the following statement: every $c \colon [\omega]^n \to k$ has an infinite thin set.


The indexing here works opposite to that of Ramsey’s theorem: if $k > j$, then it is
$\mathsf{TS}^n_k$ that is prima facie weaker than $\mathsf{TS}^n_j$.


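Proposition 9.8.15 (Dorais, Dzhafarov, Hirst, Mileti, and Shafer [72]). Fix $n \geq 1$
and $k > j \geq 2$.


1. $\mathsf{TS}^n_k$ is uniformly identity reducible to $\mathsf{TS}^n_j$.
2. $\mathsf{TS}^n_\omega$ is uniformly identity reducible to $\mathsf{TS}^n_k$.
3. $\mathsf{RCA}_0 \vdash \mathsf{TS}^n_j \to \mathsf{TS}^n_k \to \mathsf{TS}^n_\omega$.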


The implications are strict. 


Theorem 9.8.16 (Patey [243]). For all $n \geq 2$ and $k > j \geq 2$, $\mathsf{TS}^n_j \nleq_\omega \mathsf{TS}^n_k$. Hence,
also $\mathsf{RCA}_0 \nvdash \mathsf{TS}^n_k \to \mathsf{TS}^n_j$.

Compare this with Proposition 4.3.7 and Theorems 9.1.1 and 9.1.11 for Ramsey’s
theorem, where the principles $\mathsf{RT}^n_k$ for different values of $k$ are equivalent over $\mathsf{RCA}_0$
but can be separated under finer reducibilities.

Notice that $\mathsf{TS}^2_2$ is exactly $\mathsf{RT}^2_2$, so Theorem 9.8.16 actually gives us a strictly
descending sequence of principles weaker than $\mathsf{RT}^2_2$,

$$\mathsf{RT}^2_2 = \mathsf{TS}^2_2 \to \mathsf{TS}^2_3 \to \mathsf{TS}^2_4 \to \cdots \to \mathsf{TS}^2_\omega.$$


This reveals something curious. On one end of the above spectrum, the thin set
theorem behaves like Ramsey’s theorem (because the two coincide). On the other
end, it behaves quite differently. For example, for $n \geq 3$ we know $\mathsf{TS}^n_2 = \mathsf{RT}^n_2$ is
equivalent to $\mathsf{ACA}_0$, but even $(\forall n)\mathsf{TS}^n_\omega$ is strictly weaker. For any such behavior, we
can ask where the demarcation occurs, whether between $k = 2$ and $k \geq 3$, between
$k \in \omega$ and $\omega$, or (the most interesting possibility) somewhere in-between. It turns
out that coding the jump is an example of this latter case.


Proposition 9.8.17 (Dorais, Dzhafarov, Hirst, Mileti, and Shafer [72]). For all
$n \geq 1$, $\mathsf{RCA}_0 \vdash \mathsf{TS}^{n+2}_{2^n} \to \mathsf{ACA}_0$.


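Proof. We argue in $\mathsf{RCA}_0 + \mathsf{TS}^{n+2}_{2^n}$. Fix an injection $f \colon \mathbb{N} \to \mathbb{N}$. We show $\operatorname{range}(f)$
exists. For each $k < n$, define $c_k \colon [\mathbb{N}]^{n+2} \to 2$ as follows: given $\vec{x} \in [\mathbb{N}]^{n+2}$, let

$$c_k(\vec{x}) = \begin{cases} 1 & \text{if } (\exists y)[\vec{x}(k) < y < \vec{x}(k+1) \wedge f(y) < \vec{x}(0)], \\ 0 & \text{otherwise.} \end{cases}$$

Now define $c \colon [\mathbb{N}]^{n+2} \to 2^n$ by $c(\vec{x}) = \langle c_k(\vec{x}) : k < n \rangle$. Apply $\mathsf{TS}^{n+2}_{2^n}$ to find an
infinite thin set $X$ for $c$. Say $\langle b_k : k < n \rangle \in 2^n$ is such that $c(\vec{x}) \neq \langle b_k : k < n \rangle$ for
all $\vec{x} \in [X]^{n+2}$. Fix the largest $k_0 < n$ such that for all $k < k_0$ and all $m$, there exists
$\vec{x} \in [X \cap [m, \infty)]^{n+2}$ with $c_k(\vec{x}) = b_k$. (Note that $k_0$ exists because $n$ is standard.)
We claim that $b_{k_0} = 1$. Indeed, consider any $\vec{x} = (x_0, \ldots, x_{n+1}) \in [X]^{n+2}$ with
$c_k(\vec{x}) = b_k$ for all $k < k_0$. Since $f$ is injective, the set $F = \{y \in \mathbb{N} : f(y) < \vec{x}(0)\}$
is finite by Proposition 6.2.7. Choose $x^*_{k_0} < x^*_{k_0+1} < \cdots < x^*_{n+1}$ in $X$ with $x^*_{k_0}$ larger
than $x_{k_0}$ and $\max F$. For $k < k_0$ let $x^*_k = x_k$, and let $\vec{x}^* = (x^*_0, \ldots, x^*_{n+1}) \in [X]^{n+2}$.
Then $c_k(\vec{x}^*) = b_k$ for all $k < k_0$, and $c_{k_0}(\vec{x}^*) = 0$. By maximality of $k_0$ we cannot
have $c_{k_0}(\vec{x}^*) = b_{k_0}$, hence $b_{k_0} = 1$ as claimed. With this in hand, we can give a $\Pi^0_1$
definition of the range of $f$, as follows. A number $z \in \mathbb{N}$ belongs to $\operatorname{range}(f)$ if and
only if for all $\vec{x} \in [X]^{n+2}$ with $z < \vec{x}(0)$ and $c_k(\vec{x}) = b_k$ for all $k < k_0$ there is a
$y < \vec{x}(k_0)$ such that $f(y) = z$. Indeed, if for some such $\vec{x}$ we had that $f(y) = z$ for a
$y > \vec{x}(k_0)$ then we could choose $x^*_{k_0+1} < \cdots < x^*_{n+1}$ in $X$ with $y < x^*_{k_0+1}$, let $x^*_k = x_k$
for all $k \leq k_0$ and $\vec{x}^* = (x^*_0, \ldots, x^*_{n+1}) \in [X]^{n+2}$, and obtain that $c_{k_0}(\vec{x}^*) = 1 = b_{k_0}$,
a contradiction. □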


By contrast, we have the following result related to Corollary 9.8.8. 


Theorem 9.8.18 (Wang [321]). For each $n \geq 1$ there is a $k \geq 2$ such that $\mathsf{TS}^n_k$
admits strong cone avoidance. In particular, $\mathsf{TJ} \nleq_\omega \mathsf{TS}^n_k$ and $\mathsf{RCA}_0 \nvdash \mathsf{TS}^n_k \to \mathsf{ACA}_0$.

By Proposition 9.8.17, the $k$ in Wang’s theorem must be larger than $2^{n-2}$. Wang himself
showed that if $k$ is at least the $n$th Schröder number, $S_n$, then this is large enough.
The Schröder numbers are defined by the following recurrence relation:

$$S_0 = 1, \qquad S_{n+1} = S_n + \sum_{k \leq n} S_k S_{n-k}.$$


It can be checked that $S_n > 2^{n-2}$ for all sufficiently large $n$, so Wang asked whether
the number of colors $k$ above can be characterized exactly. Amazingly, the answer is
yes! The characterization is in terms of the Catalan numbers, $d_n$, which are defined
using the recurrence

$$d_0 = 1, \qquad d_{n+1} = \sum_{k \leq n} d_k d_{n-k}.$$


Theorem 9.8.19 (Cholak and Patey [30]). Fix $n \geq 1$.


1. $\mathsf{TS}^n_k$ admits strong cone avoidance if and only if $k > d_n$.
2. $\mathsf{TS}^n_k$ admits cone avoidance if and only if $k > d_{n-1}$.
3. If $k \leq d_{n-1}$ then $\mathsf{TS}^n_k$ codes the jump and $\mathsf{RCA}_0 \vdash \mathsf{TS}^n_k \to \mathsf{ACA}_0$.
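
To get a feel for the thresholds, one can tabulate the first few values. The sketch below is illustrative only: it assumes the standard recurrences for the (large) Schröder numbers, $S_{n+1} = S_n + \sum_{k \leq n} S_k S_{n-k}$, and for the Catalan numbers, $d_{n+1} = \sum_{k \leq n} d_k d_{n-k}$, both starting from 1; the function names are made up for the example.

```python
def schroeder(n_max):
    """Return [S_0, ..., S_{n_max}] for the large Schroeder numbers."""
    S = [1]
    for n in range(n_max):
        S.append(S[n] + sum(S[k] * S[n - k] for k in range(n + 1)))
    return S

def catalan(n_max):
    """Return [d_0, ..., d_{n_max}] for the Catalan numbers."""
    d = [1]
    for n in range(n_max):
        d.append(sum(d[k] * d[n - k] for k in range(n + 1)))
    return d

def thresholds(n):
    """Least k with strong cone avoidance, and least k with cone avoidance,
    for the thin set theorem on n-tuples, read off from the Catalan numbers."""
    d = catalan(n)
    return d[n] + 1, d[n - 1] + 1
```

Under these recurrences the sequences begin $1, 2, 6, 22, 90$ and $1, 1, 2, 5, 14, 42$ respectively, so for pairs the characterization gives cone avoidance for every $k \geq 2$ and strong cone avoidance exactly for $k \geq 3$.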


For $n = 2$, this phenomenon is even more intriguing, since $\mathsf{TS}^2_2 = \mathsf{RT}^2_2$ implies
so many other principles while, as already mentioned, $\mathsf{TS}^2_\omega$ seems to imply so few.
There are many potential questions to investigate. For example, recently, Liu and 
Patey [199] showed that EM <,, is). and hence RCAg ¥ TS} — EM. The question 
of whether this is sharp is open. 


Question 9.8.20. Is it the case that RCAg + TS — EM? 


9.9 Hindman’s theorem 


We now turn to a partition result that behaves rather differently from Ramsey’s 
theorem and, indeed, is in many ways even more surprising. This is Hindman’s 
theorem, first proved by Hindman [143] as the resolution to a longstanding conjecture 
in combinatorics. As with Ramsey’s theorem, there are a number of different proofs 
of this result. Hindman’s original proof was a complicated combinatorial argument. 
Later, a simpler combinatorial proof was discovered by Baumgartner [12]. Galvin 
and Glazer (see [50]) gave a proof using idempotent ultrafilters. Blass, Hirst, and 
Simpson [17] found a way to make Hindman’s proof more effective. They used this 
to establish what is still the best upper bound on the strength of Hindman’s theorem 
in the context of reverse mathematics. We will see this result in Section 9.9.2. There, 
we will use an elegant and greatly simplified proof of Hindman’s theorem due to
Towsner [311]. 
We begin with definitions and the statement of Hindman’s theorem. 


Definition 9.9.1. For X ⊆ ω,
FS(X) = {x ∈ ω : (∃n ≥ 1)(∃y_0 < ··· < y_{n−1} ∈ X)[x = y_0 + ··· + y_{n−1}]}.


Worth stressing is that FS(X) is the set of all nonempty finite sums of distinct 
elements of X. We also repeat the caution from the previous section that this notion 
has nothing to do with the free set theorem studied there. 
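For a finite set X, the set FS(X) can be computed by brute force. The sketch below is only an illustration of the definition, restricted to finite inputs; the name finite_sums is hypothetical.

```python
from itertools import combinations


def finite_sums(xs):
    """FS(X) for a finite X: sums of nonempty sets of distinct elements."""
    elems = sorted(set(xs))
    return {
        sum(combo)
        for size in range(1, len(elems) + 1)
        for combo in combinations(elems, size)
    }
```

For example, finite_sums({1, 3}) is {1, 3, 4}: the sum 1 + 1 = 2 is excluded because the summands must be distinct.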


Definition 9.9.2 (Hindman’s theorem). HT is the following statement: for every
k ≥ 1 and every c: ω → k, there is an infinite set I ⊆ ω such that c is constant on
FS(I).


The set of finite sums from an infinite set is called an IP set, so Hindman’s theorem
can also be stated as follows: every coloring of the integers is constant on some IP 
set. Despite this being a common terminology, we will avoid it here in favor of the 
more explicit FS notation. 

In spite of appearances, most similarities between Hindman’s theorem and Ram- 
sey’s theorem are superficial. Their internal combinatorics seem quite different. For 
example, as we will see in Section 10.6, Hindman’s theorem has a close relationship 
with topological dynamics; no “dynamical” proof of Ramsey’s theorem is known. 
Indeed, even the following “finite version” of Hindman’s theorem, proved some forty 
years earlier, required considerable techniques. 


Theorem 9.9.3 (Rado [252]). For every n ∈ ω and every c: ω → k, k ≥ 1, there
is a set F ⊆ ω with |F| = n such that c is constant on FS(F).


As a taste, let us prove this result for n = 3 and k = 2. Although the argument 
is elementary, it is quite a bit less trivial than the analogous result for Ramsey’s 
theorem. Indeed, for colorings of n-tuples with n > 3, the latter is vacuous, and 
for colorings of singletons it is trivial. For colorings of pairs, the proof is easy and 
well-known: any six numbers contain among them three such that a given coloring 
of pairs is constant on all the pairs formed from these three numbers (a fact that can 
be summarized as R3(3) = 6, using the notation for Ramsey numbers introduced in 
Section 3.3). 


Proof (of Theorem 9.9.3 for n = 3 and k = 2). Without loss of generality, c(0) = 0.
We may then also assume there are infinitely many x ∈ ω with c(x) = 0, since
otherwise the result is trivial. Now, if there exist positive numbers x < y such that
c(x) = c(y) = c(x + y) = 0, we can take F = {0, x, y}. So assume not. Fix positive
numbers x_0 < x_1 < ··· < x_5 such that c(x_i) = 0 for all i ≤ 5 and such that x_{i+1} − x_i
is different for each i < 5. Let d_i = x_{i+1} − x_i. Since x_i + d_i = x_{i+1} for all i ≤ 4, our
assumption implies that c(d_i) = 1. More generally, since x_i + d_i + d_{i+1} + ··· + d_{i+j} =
x_{i+j+1} for all i, j with i + j ≤ 4, we have that c(d_i + d_{i+1} + ··· + d_{i+j}) = 1. Notice
also that we cannot have c(d_0 + d_3) = c(d_1 + d_4) = c(d_0 + d_1 + d_3 + d_4) = 0.
So, if c(d_0 + d_3) = 1, we can take F = {d_0, d_1 + d_2, d_3}. If c(d_1 + d_4) = 1, we
can take F = {d_1, d_2 + d_3, d_4}. And if c(d_0 + d_1 + d_3 + d_4) = 1 we can take
F = {d_0 + d_1, d_2, d_3 + d_4}. □
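The case n = 3, k = 2 can also be explored experimentally. The sketch below is not the argument of the proof but a plain exhaustive search over an initial segment of ω; the sample coloring bit_parity and all function names are assumptions made for the illustration. Like the proof, it allows 0 as an element of F.

```python
from itertools import combinations


def finite_sums(xs):
    """FS(X) for a finite X: sums of nonempty sets of distinct elements."""
    elems = sorted(set(xs))
    return {sum(c) for r in range(1, len(elems) + 1) for c in combinations(elems, r)}


def find_monochromatic_triple(color, bound):
    """Search {0, ..., bound - 1} for a 3-element F with color constant on FS(F)."""
    for triple in combinations(range(bound), 3):
        if len({color(s) for s in finite_sums(triple)}) == 1:
            return set(triple)
    return None


def bit_parity(x):
    """Hypothetical sample 2-coloring: parity of the number of 1 bits of x."""
    return bin(x).count("1") % 2
```

Calling find_monochromatic_triple(bit_parity, 16) returns a witness F, and the search bound needed for a given coloring is exactly the kind of quantity that the finite theorem guarantees to exist.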


This proof is a bit cheeky, since we allow ourselves the use of 0. Still, it illustrates 
some of the subtle ways in which Hindman’s theorem is combinatorially more 
complicated than Ramsey’s theorem. In general, the additional relations imposed by 
sums are quite significant. 


9.9.1 Apartness, gaps, and finite unions 


An important property of HT-solutions is that they always contain subsequences 
of numbers that are very “spread apart” in a certain technical sense. As we will
see, one use of these is to encode information into such solutions. To make this 
precise, let us begin with some preliminary definitions and results. We have already 
discussed a number of ways of coding finite sets by numbers. Here, we will employ 
binary representations, as expressed in the following lemma. The proof is left to 
Exercise 9.13.10. 


Lemma 9.9.4 (Binary representation of the integers). The following is provable
in RCA_0. For every x ∈ N, there exists a unique σ ∈ N^{<N} with σ(i) < σ(j) for all
i < j < |σ| and such that x = Σ_{i<|σ|} 2^{σ(i)}.


Let us fix some notation to reflect this coding. 


Definition 9.9.5. The following definitions are made in RCA_0.


1. Let b: N → P_fin(N) be the function which assigns to each x ∈ N the range of
the unique σ ∈ N^{<N} given by Lemma 9.9.4.

2. The function b has an inverse, b^{−1}: P_fin(N) → N, defined by b^{−1}(F) = 2^{x_0} +
··· + 2^{x_{n−1}} for all F = {x_0 < ··· < x_{n−1}} ∈ P_fin(N).

3. For x ∈ N, let λ(x) = min b(x) and μ(x) = max b(x).


The justification for the maps b and b^{−1} actually being inverses (and hence also for
our notation) follows by Lemma 9.9.4. The following basic facts concerning λ and
μ are easy to verify in RCA_0.


• For all x, y ∈ N, if λ(x) = λ(y) then λ(x + y) ≥ λ(x) + 1.
• For all x, y ∈ N, if λ(x) < λ(y) then λ(x + y) = λ(x).
• For all nonempty finite sets F < E, μ(b^{−1}(F)) < λ(b^{−1}(E)).
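The coding by binary representations and the facts just listed can be checked mechanically on small numbers. This is a minimal sketch; the names b, b_inv, lam, and mu mirror the notation b, b^{−1}, λ, and μ above, and check_basic_facts is a hypothetical helper that tests the first two facts for positive arguments only.

```python
def b(x):
    """Set of exponents in the binary expansion of x (so b(0) is empty)."""
    return {i for i in range(x.bit_length()) if (x >> i) & 1}


def b_inv(exponents):
    """Inverse of b: the number whose binary expansion has exactly these exponents."""
    return sum(1 << i for i in exponents)


def lam(x):
    """lambda(x) = min b(x), defined for x > 0."""
    return min(b(x))


def mu(x):
    """mu(x) = max b(x), defined for x > 0."""
    return max(b(x))


def check_basic_facts(limit):
    """Check the first two listed facts for all 0 < x, y < limit."""
    for x in range(1, limit):
        assert b_inv(b(x)) == x
        for y in range(1, limit):
            if lam(x) == lam(y):
                assert lam(x + y) >= lam(x) + 1
            elif lam(x) < lam(y):
                assert lam(x + y) == lam(x)
    return True
```

For example, b(12) is {2, 3}, so lam(12) is 2 and mu(12) is 3.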

The next lemma is slightly more subtle. There is a simple inductive proof, but it
requires quantifying over infinite sets and hence requires induction for Π^1_1 formulas.
We give a more careful argument that avoids this.


Lemma 9.9.6. The following is provable in RCA_0 + BΣ^0_2: for every m ∈ N and every
infinite I ⊆ N, there exists x ∈ FS(I) with λ(x) ≥ m.
Proof. Fix m and I, and suppose every x ∈ FS(I) satisfies λ(x) < m. By BΣ^0_2, in the
guise of RT^1, there exists an ℓ < m such that λ(x) = ℓ for infinitely many x ∈ FS(I).
In particular, the set


F = {ℓ < m : (∃x_0, x_1 ∈ FS(I))[x_0 < x_1 ∧ λ(x_0) = λ(x_1) = ℓ]}


is nonempty. Note that F exists by bounded Σ^0_1 comprehension. We claim that ℓ < m
belongs to F if and only if there are infinitely many x ∈ FS(I) with λ(x) = ℓ. The “if”
part is clear. For the “only if”, suppose ℓ < m belongs to F but there are only finitely
many x with λ(x) = ℓ. Then we can fix x_0 < x_1 in FS(I) with λ(x_0) = λ(x_1) = ℓ,
and such that λ(x) ≠ ℓ for all x > x_1 in FS(I). Let x_2 = x_1 + (x_0 + x_1). We have that
λ(x_0 + x_1) > ℓ and hence λ(x_2) = ℓ. But x_2 > x_1, so we have a contradiction to the
maximality of x_1. Thus, the claim holds. To complete the proof, fix the largest ℓ < m
in F. By the claim, we can fix a sequence x_0 < x_1 < ··· of elements of FS(I) with
λ(x_i) = ℓ for all i. For each i, let y_i = x_{2i} + x_{2i+1}. Then y_0 < y_1 < ··· is a sequence
of elements of FS(I) with λ(y_i) > ℓ for all i. By a further application of BΣ^0_2, there
is an ℓ* > ℓ such that λ(y_i) = ℓ* for infinitely many i. But then ℓ* ∈ F, contradicting
the maximality of ℓ. We conclude that there is an x ∈ FS(I) with λ(x) ≥ m, as was
to be shown. □


With this in hand, we can state the “apartness” property we mentioned, and prove 
that it is enjoyed by HT-solutions. 


Definition 9.9.7. 


1. A set X ⊆ ω satisfies the apartness property if for all x < y in X, μ(x) < λ(y).
2. HT with apartness is the statement of HT, augmented to include that the solution 
set I satisfies the apartness property. 


Corollary 9.9.8. 


1. For every infinite set I ⊆ ω there is an infinite I-computable subset of FS(I)
satisfying the apartness property.

2. RCA_0 + BΣ^0_2 proves that for every infinite set I ⊆ N there is an infinite subset of
FS(I) satisfying the apartness property.


Proof. Define a sequence of elements x_0 < x_1 < ··· of FS(I) recursively as
follows. Let x_0 = min I, and given x_i for some i ∈ N, let x_{i+1} be the least element
of FS(I) such that x_i < x_{i+1} and μ(x_i) < λ(x_{i+1}). The existence of x_{i+1} follows by
Lemma 9.9.6. □
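The greedy construction in this proof can be imitated on a finite initial segment of I. The sketch below is an illustration under that restriction: it assumes the segment consists of positive numbers, it stops as soon as the finite piece of FS(I) is exhausted, and all function names are hypothetical.

```python
def b(x):
    """Set of exponents in the binary expansion of x."""
    return {i for i in range(x.bit_length()) if (x >> i) & 1}


def lam(x):
    return min(b(x))


def mu(x):
    return max(b(x))


def subset_sums(xs):
    """FS(xs) for a finite collection xs, built one element at a time."""
    sums = set()
    for x in sorted(set(xs)):
        sums |= {s + x for s in sums} | {x}
    return sums


def apart_sequence(initial_segment):
    """Greedy sequence x_0 < x_1 < ... from a finite piece of I (positive numbers)."""
    fs = sorted(subset_sums(initial_segment))
    seq = [min(initial_segment)]
    while True:
        # Least element of the finite FS above x_i whose lambda exceeds mu(x_i).
        nxt = next((x for x in fs if x > seq[-1] and lam(x) > mu(seq[-1])), None)
        if nxt is None:
            return seq
        seq.append(nxt)
```

Applied to the odd numbers below 32, for instance, the sequence starts at 1 and every later term has all of its binary digits strictly above those of its predecessor.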


Theorem 9.9.9. 


1. RCA_0 proves that HT is equivalent to HT with the apartness property.
2. HT is Weihrauch equivalent to HT with the apartness property.


Proof. For (1), we argue in RCA_0 + HT. Fix an instance c: N → k of HT. Apply HT
to get a solution, I. Now as RT^1 is a subproblem of HT, we have at our disposal BΣ^0_2.
Hence, we can apply Corollary 9.9.8 to find an infinite set J ⊆ FS(I) satisfying the
apartness property. We have FS(J) ⊆ FS(I), hence c is constant on FS(J). Part (2) is
similar. □
The key point is that apartness really is natural to the combinatorics of Hindman’s
theorem. We will see several applications of it, both in this section and again in 
Section 9.9.3. To begin, we will use apartness in an essential way to prove the 
following: 


Theorem 9.9.10 (Blass, Hirst, and Simpson [17]). RCA_0 ⊢ HT → ACA_0.


As usual, the proof will be a formalization of the fact that HT codes the jump. 
However, as there are some small subtleties in the proof concerning induction, we 
will directly give the formal version. First, some definitions. 


Definition 9.9.11. The following definitions are made in RCA_0, for a fixed function
f: N → N. Let b: N → P_fin(N) be the map of Definition 9.9.5.


1. A gap in x ∈ N is a pair {u, v} ⊆ b(x) such that u < v and there is no w ∈ b(x)
with u < w < v.

2. A short gap in x (with respect to f) is a gap {u, v} in x such that there exists
n < u in the range of f with f(s) ≠ n for all s < v.

3. A very short gap in x (with respect to f) is a gap {u, v} in x such that there exists
n < u with f(s) = n for some s < μ(x) but f(s) ≠ n for all s < v.

4. For x ∈ N, SG(x) is the number of short gaps in x.

5. For x ∈ N, VSG(x) is the number of very short gaps in x.
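To get a feel for these notions, one can compute gaps for a concrete number and a finite stand-in for f. The sketch below makes a strong simplifying assumption: f is given as a finite Python list, with f[s] playing the role of f(s), so that "the range of f" means the values in that list. In the setting of the definition, membership in the range of f is not decidable in general, which is the whole point of the construction. All function names are hypothetical.

```python
def b(x):
    """Set of exponents in the binary expansion of x."""
    return {i for i in range(x.bit_length()) if (x >> i) & 1}


def gaps(x):
    """Pairs (u, v) of consecutive elements of b(x)."""
    bits = sorted(b(x))
    return list(zip(bits, bits[1:]))


def is_short(gap, f):
    """Some n < u lies in the (finite) range of f but f(s) != n for all s < v."""
    u, v = gap
    return any(n in f and n not in f[:v] for n in range(u))


def is_very_short(gap, x, f):
    """Some n < u has f(s) = n for some s < mu(x) but f(s) != n for all s < v."""
    u, v = gap
    seen_before_mu = f[:max(b(x))]
    return any(n in seen_before_mu and n not in f[:v] for n in range(u))


def SG(x, f):
    return sum(is_short(g, f) for g in gaps(x))


def VSG(x, f):
    return sum(is_very_short(g, x, f) for g in gaps(x))
```

With the sample list f = [3, 8, 9, 10, 0, 11, 12, 1] and x = 76 = 2^2 + 2^3 + 2^6, the gaps are (2, 3) and (3, 6). Both are short, but only (2, 3) is very short, since the value 1 enters the list only at position 7, which is past μ(x) = 6.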


Proof (of Theorem 9.9.10). We argue in RCA_0 + HT. Fix an injective function
f: N → N. We claim that the range of f exists. In what follows, all short gaps
and very short gaps will be meant with respect to this f.

Define c: N → 2 by c(x) = 0 if VSG(x) is even, and c(x) = 1 if VSG(x) is odd.
Notice that c exists by Δ^0_1 comprehension, as the definition of VSG(x) is uniformly
Δ^0_1 in x. Let I be a solution to c as given by HT with apartness.

Consider any x < y in FS(I). Since I satisfies the apartness property, it follows
that μ(x) < λ(y). Hence, the gaps in x + y are precisely the gaps in x, the pair
{μ(x), λ(y)}, and the gaps in y. Also, every short gap in x or y is short in x + y. Thus,
SG(x + y) is either SG(x) + SG(y) + 1 or SG(x) + SG(y), depending as {μ(x), λ(y)}
is or is not short in x + y. We claim that the former case is impossible, i.e., that
{μ(x), λ(y)} is never short in x + y. This will complete the proof, since then for all
n ∈ N we will have


n ∈ range(f) ↔ (∀x, y ∈ FS(I))[x < y ∧ n < μ(x) → (∃s < λ(y))[f(s) = n]],


giving a Π^0_1 definition of the range of f.

To prove the claim, we show that SG(x) is even for all x ∈ FS(I). Thus, we cannot
have SG(x + y) = SG(x) + SG(y) + 1 above, and the claim follows. So fix x. Let
F = {n < μ(x) : (∃s)[f(s) = n]}, which exists by bounded Σ^0_1 comprehension.
By BΣ^0_2, which we have on account of having assumed HT, we can fix s_0 > x such
that for all n ∈ F, f(s) = n for some s < s_0. And because I satisfies the apartness
property, we can fix a y > s_0 in I. Let us count which of the gaps in x + y are very
short. Every very short gap in y is very short in x + y since μ(y) = μ(x + y). By
choice of y, {μ(x), λ(y)} is not very short (or even short) in x + y. But by contrast,
every short gap in x is very short in x + y since μ(x) < λ(y) < μ(y). Of course, no
gap in x that is not short can be very short in x + y. Thus, we have


VSG(x + y) = SG(x) + VSG(y).


But since y, x + y ∈ FS(I), we have c(y) = c(x + y), and so VSG(y) and VSG(x + y)
have the same parity. We conclude that SG(x) is even. □


We conclude this section by mentioning an oft-stated variant of Hindman’s theo- 
rem, which is sometimes easier to work with. We call this version the finite unions 
theorem, and list it under the abbreviation FUT below. 


Definition 9.9.12. Fix X ⊆ ω.


1. P_fin(X) is the set of nonempty finite subsets of X.
2. For U ⊆ P_fin(X),


FU(U) = {E ∈ P_fin(X) : (∃n ≥ 1)(∃F_0, ..., F_{n−1} ∈ U)[E = ⋃_{i<n} F_i]}.


3. U ⊆ P_fin(X) satisfies the apartness property if for all E, F ∈ U, either E < F
or F < E.


As with HT, we are looking at nonempty finite unions here. 


Definition 9.9.13 (Finite unions theorem). FUT is the following statement: for
every k ≥ 1 and every c: P_fin(ω) → k, there is an infinite set U ⊆ P_fin(ω) with the
apartness property such that c is constant on FU(U).


It is straightforward to formalize this definition in RCA_0. One observation is that,
with unions of sets, there is no way to get an analogue of Corollary 9.9.8, so the 
apartness property is explicitly part of the definition here. 


Proposition 9.9.14.


1. RCA_0 ⊢ HT ↔ FUT.
2. HT ≡_W FUT.


Proof. We argue in RCA_0 and prove (1). To show that HT → FUT, assume HT
and let c: P_fin(N) → k be given. Let b: N → P_fin(N) be the map defined in
Definition 9.9.5. Define d: N → k by d(x) = c(b(x)) for all x, and let I be an
infinite set so that d is constant on FS(I). By Theorem 9.9.9, we may assume I
satisfies the apartness property. Let U = {b(x) : x ∈ I}. Then U satisfies the
apartness property and c is constant on FU(U).

For the converse, assume FUT and let c: N → k be given. Let b^{−1} be the inverse
map of Definition 9.9.5, and define d: P_fin(N) → k by d(F) = c(b^{−1}(F)) for all
F ∈ P_fin(N). Apply FUT to get an infinite set U ⊆ P_fin(N) with the apartness
property such that d is constant on FU(U). Let I = {b^{−1}(F) : F ∈ U}. If E, F are
finite sets with E < F then b^{−1}(E ∪ F) = b^{−1}(E) + b^{−1}(F). So by Σ^0_0 induction
on n, if F_0, ..., F_{n−1} are finite sets such that, for each i < n − 1, either F_i < F_{i+1}
or F_{i+1} < F_i, then b^{−1}(⋃_{i<n} F_i) = Σ_{i<n} b^{−1}(F_i). It follows that c is constant on
FS(I). □
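The translation between sums and unions used in this proof can be checked on a small example. The sketch below is an illustration only, on a finite family with the apartness property; b_inv mirrors b^{−1} from Definition 9.9.5 and the other names are hypothetical.

```python
from itertools import combinations


def b_inv(exponents):
    """The number whose binary expansion has exactly the given exponents."""
    return sum(1 << i for i in exponents)


def finite_sums(xs):
    """FS(X) for a finite X: sums of nonempty sets of distinct elements."""
    elems = sorted(set(xs))
    return {sum(c) for r in range(1, len(elems) + 1) for c in combinations(elems, r)}


def finite_unions(family):
    """FU(U) for a finite family U of finite sets, as a set of frozensets."""
    family = [frozenset(s) for s in family]
    return {
        frozenset().union(*combo)
        for size in range(1, len(family) + 1)
        for combo in combinations(family, size)
    }
```

For the family U = [{0, 1}, {3}, {5, 6}], the corresponding numbers are 3, 8, and 96, and applying b_inv to each member of finite_unions(U) yields exactly finite_sums({3, 8, 96}). This is the finite shadow of the identity b^{−1}(E ∪ F) = b^{−1}(E) + b^{−1}(F) for E < F.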


9.9.2 Towsner’s simple proof 


Hindman’s original proof of HT used a complex but purely combinatorial argument. 
Subsequently, several “high powered” proofs of Hindman’s theorem were obtained. 
One method uses ultrafilters and the topology of the Stone–Čech compactification
of w; see Hindman and Strauss [145]. Another method uses the Auslander—Ellis 
theorem of topological dynamics; see Furstenberg [120]. We sketch a generalization 
of this dynamical proof in Theorem 10.6.15. These methods are difficult to formalize 
in second-order arithmetic although, as we will see, some progress has been made 
on formalizing the proofs via ultrafilters. 

The best known upper bound on HT, which we prove in this section, involves 
the stronger system ACA_0^+ from Definition 5.6.8. Blass, Hirst, and Simpson obtained
the following theorem through an intricate analysis of Hindman’s original, purely 
combinatorial proof. 


Theorem 9.9.15 (Blass, Hirst, and Simpson [17]). ACA_0^+ ⊢ HT.

The question of whether this bound is optimal is open.

Question 9.9.16. Is it the case that RCA_0 ⊢ HT → ACA_0^+?


Indeed, it is not even known whether Theorem 9.9.10, which shows that HT codes 
the jump, can be improved to code the double jump. 


Question 9.9.17. Does there exist a computable instance of HT every solution to
which computes ∅''?


On the flip side, it is still possible that HT could be provable in ACA_0.

Question 9.9.18. Is it the case that ACA_0 ⊢ HT?


By Theorem 5.6.7, a positive answer here would require HT to admit Δ^0_n solutions
for some fixed n € w. The strongest result here, brand new at the time of this writing, 
is that in this case n would have to be at least 4. 


Theorem 9.9.19 (Yuke [332]). There exists a computable instance of HT with no
∅''-computable solution.


Let us turn to Theorem 9.9.15. We present a newer proof due to Towsner [311], 
who called it a “simple proof of Hindman’s theorem”. And indeed, computability and
reverse math aside, Towsner’s is arguably the most elementary proof of Hindman’s 
theorem to date. Keeping track of the effectivity in the argument yields the following. 
Theorem 9.9.20. For every A ∈ 2^ω and every X ≫ A^{(ω)}, every A-computable
instance of HT has an X-computable solution.


The proof is readily formalized in RCA_0. Since ACA_0^+ certainly implies WKL, this
yields Theorem 9.9.15.

Towsner actually proved the analogue of Theorem 9.9.20 for FUT, rather than HT 
directly. These are equivalent, as we know, but we have translated our presentation 
from finite sets and unions to integers and sums. The setup is otherwise unchanged. In 
Lemma 9.9.24, we incorporate some expositional elements from another presentation 
of the proof, due to Anglés d’Auriac [60]. We will use the following notions. 


Definition 9.9.21. Fix X ⊆ ω and c: ω → k, k ≥ 1. Let b: ω → P_fin(ω) be the
map defined in Definition 9.9.5.


1. For y ∈ ω, X ⊥ y = {x ∈ X : b(x) ∩ b(y) = ∅}.

2. For Y ⊆ ω, X ⊥ Y = ⋂_{y∈Y} X ⊥ y.

3. X half-matches y ∈ ω (relative to c) if (∃x ∈ X)[c(y) = c(x + y)].

4. X full-matches y ∈ ω (relative to c) if (∃x ∈ X)[c(y) = c(x + y) = c(x)].

5. X half-matches Y ⊆ ω or full-matches Y ⊆ ω if it, respectively, half-matches or
full-matches every y ∈ Y.
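These notions are simple to experiment with on finite sets. The following sketch restricts the definitions to finite X and Y; the function names perp, half_matches, and full_matches, and the sample coloring bit_parity, are assumptions introduced for the illustration.

```python
def b(x):
    """Set of exponents in the binary expansion of x."""
    return {i for i in range(x.bit_length()) if (x >> i) & 1}


def perp(X, Y):
    """Elements of X whose binary expansion is disjoint from that of every y in Y."""
    return {x for x in X if all(not (b(x) & b(y)) for y in Y)}


def half_matches(X, y, color):
    """Some x in X has color(y) == color(x + y)."""
    return any(color(y) == color(x + y) for x in X)


def full_matches(X, y, color):
    """Some x in X has color(y) == color(x + y) == color(x)."""
    return any(color(y) == color(x + y) == color(x) for x in X)


def bit_parity(x):
    """Hypothetical sample 2-coloring: parity of the number of 1 bits of x."""
    return bin(x).count("1") % 2
```

With this coloring, {3} half-matches 4 (both 4 and 7 have an odd number of 1 bits) but does not full-match it (3 has an even number), whereas {3} does full-match 12.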


To begin, we prove a series of technical lemmas. Each comes in two parts: first, a 
purely combinatorial fact asserting the existence of certain sets; second, a calculation 
of an upper bound on the complexity of these sets relative to the givens. The reader 
who has not yet seen a proof of Hindman’s theorem may wish, at first pass, to simply 
read the combinatorial parts, and only then go back and reread the proofs with a 
view to the effectivity estimates. Incidentally, the hypotheses of all the lemmas are 
the same. In particular, we always have a fixed HT-instance, c. All references to 
half-matching and full-matching should be understood as being with respect to this 
c, unless explicitly specified otherwise. 


Lemma 9.9.22. Suppose I ⊆ ω satisfies the apartness property, and let c: FS(I) →
k, k ≥ 1 be given.


1. For every finite set F ⊆ I, one of the following holds.


a. There is a finite set D ⊆ I ⊥ F such that for every x ∈ FS(I ⊥ (F ∪ D)), F
does not half-match x + y for some y ∈ FS(D).
b. There is an infinite set J ⊆ I ⊥ F such that F half-matches FS(J).


2. If c and I are computable, then J in (b) can be chosen to be computable.


Proof. For (1), fix F and assume (a) fails. We define a set J* = {x_0, x_1, ...} ⊆
FS(I ⊥ F) inductively so that F half-matches any sum of two or more elements of J*.
Let x_0 = min I ⊥ F. Fix i > 0, and assume we have defined x_j for all j < i. Let D be
the finite set {x_j : j < i}. Then by assumption, there is an x ∈ FS(I ⊥ (F ∪ D)) such
that F half-matches x + y for all y ∈ FS(D). Let x_i be the least such x. Now any sum
of two or more elements of D ∪ {x_i} is either a sum of two or more elements of D or
has the form x_i + y for some y ∈ FS(D). Hence, by induction, it is half-matched by
F. This completes the construction. Let J = {x_{2i} + x_{2i+1} : i ∈ N}. Clearly, J* satisfies
the apartness property, and hence so does J. And as every element of FS(J) is a
sum of two or more elements of J*, it is half-matched by F. Thus, we have (b).
Clearly, J is computable in c and I. So we have (2). □


Lemma 9.9.23. Suppose I ⊆ ω satisfies the apartness property, and let c: FS(I) →
k, k ≥ 1 be given.


1. There is a finite set F ⊆ I and an infinite set J ⊆ I ⊥ F such that F half-matches
FS(J).


2. If c and I are computable, then J can be chosen to be computable.


Proof. For (1), assume not. Then for every finite set F ⊆ I, alternative (a) holds
in the previous lemma. For each i ≤ k, we define a finite set F_i inductively as
follows. Let F_0 = {min I} and suppose that for some i ≤ k − 1, F_i has been
defined. By the previous lemma, there is a finite set D ⊆ I ⊥ F_i such that for every
x ∈ FS(I ⊥ (F_i ∪ D)), F_i does not half-match x + y for some y ∈ FS(D). Let D_i be
the least such D, and let F_{i+1} = FS(F_i ∪ D_i). This completes the construction.

Now, let F = F_k and J = I ⊥ F_k. We claim that F half-matches FS(J), contrary
to our assumption.

To prove the claim, fix any x ∈ FS(J). We define numbers x_0, ..., x_k by reverse
induction on i ≤ k, as follows. Let x_k = x, noting that x_k ∈ FS(I ⊥ F_k). Now
assume that we have defined x_i for some positive i ≤ k and that x_i ∈ FS(I ⊥ F_i).
By construction, there is a y ∈ FS(D_{i−1}) such that F_{i−1} does not half-match x_i + y. Let
y_{i−1} be the least such y, and let x_{i−1} = x_i + y_{i−1}. Notice that x_{i−1} ∈ FS(I ⊥ F_{i−1})
since F_{i−1} ⊆ F_i and D_{i−1} ⊆ I ⊥ F_{i−1}.

Fix i < k, so that x_i = x_{i+1} + y_i. By definition, F_i does not half-match x_{i+1} + y_i, so
we must have that c(x_{i+1} + y_i) ≠ c(x_{i+1} + y_i + y) for all y ∈ F_i. Now, a straightforward
induction shows that if j < i then x_j = x_{i+1} + y_i + y for some y ∈ F_i. (Induct on
i − j.) Thus we have in particular that c(x_j) ≠ c(x_{i+1} + y_i) = c(x_i). Since, by
Proposition 6.2.7, there is no injective map from k to any k* < k, it must be that every
color is used by c(x_i) for some i < k. Thus c(x) = c(x_i) for some i < k. But x_i = x + y
for some y ∈ F_k, so x_i witnesses that F_k half-matches x, as was to be shown.

For (2), note that J here was obtained by finitely many iterations of the previous
lemma, using the same c and I at each step. □


Lemma 9.9.24. Suppose I ⊆ ω satisfies the apartness property, and let c: FS(I) →
k, k ≥ 1 be given.


1. One of the following holds.


a. There is a finite set F ⊆ I and an infinite set J ⊆ I ⊥ F such that F
full-matches FS(J).

b. There is an ℓ < k and an infinite set J ⊆ I such that c(x) ≠ ℓ for all
x ∈ FS(J).


2. If c and I are computable, then J in (a) can be chosen to be computable, and J
in (b) can be chosen to be ∅^{(3)}-computable.
Proof. To prove (1), assume (a) fails. For each i, we define a finite set F_i, an infinite
set I_i ⊆ I, and a finite coloring c_i of FS(I_i). We will also ensure that F_{i+1} ⊆ I_i
and that I_{i+1} ⊆ I_i ⊥ F_{i+1}. Let F_0 = ∅, I_0 = I, and c_0 = c ↾ FS(I_0). Now fix
i ∈ ω, and assume F_i, I_i, and c_i have been defined. Apply Lemma 9.9.23 (1) to c_i
and I_i to find a finite set F_{i+1} ⊆ I_i and an infinite set I_{i+1} ⊆ I_i ⊥ F_{i+1} satisfying
the apartness condition such that F_{i+1} half-matches FS(I_{i+1}) with respect to c_i. By
removing an initial segment if necessary, we may also assume I_{i+1} > F_{i+1}, which
will be useful later. Define c_{i+1} on FS(I_{i+1}) as follows: given x ∈ FS(I_{i+1}), choose
the least y ∈ F_{i+1} such that c_i(x) = c_i(x + y) and let c_{i+1}(x) = ⟨y, c_i(x)⟩. (Thus, c_{i+1}
uses at most |F_{i+1}| times as many colors as c_i.)

As we have assumed that (a) fails, it follows that for each i, FS(⋃_{j≤i} F_j) does
not full-match FS(I_i) relative to c. Let x_i ∈ FS(I_i) be the least witness to this fact.
Fix the least ℓ < k such that c(x_i) = ℓ for infinitely many i, and let i_0 < i_1 < ···
enumerate all positive such i. Consider any n ∈ ω. We show that for each i < i_n, there
exists y_i ∈ F_{i+1} such that if y ∈ FS({y_i : i < i_n}) then c(x_{i_n} + y) = c(x_{i_n}). First,
given i and x ∈ FS(I_{i+1}), let y_i(x) be the least y ∈ F_{i+1} such that c_i(x) = c_i(x + y),
as in the definition of c_{i+1} above. Then for each i < i_n, let y_i = y_i(x_{i_n}).

Let us verify that the above choice works. We claim that for all m ≥ 1 and all
y = y_{j_0} + ··· + y_{j_{m−1}} ∈ FS({y_i : i < i_n}) with j_{m−1} < ··· < j_0 < i_n, the following
holds: for all j < j_{m−1},


y_j(x_{i_n} + y_{j_0} + ··· + y_{j_{m−1}}) = y_j (9.2)
and
c_j(x_{i_n} + y_{j_0} + ··· + y_{j_{m−1}}) = c_j(x_{i_n}). (9.3)


Taking j = 0 in the second equality gives the desired conclusion since c_0 = c. By
definition, for each positive j ≤ j_0 we have


c_j(x_{i_n} + y) = ⟨y_{j−1}(x_{i_n} + y), c_{j−1}(x_{i_n} + y)⟩


and
c_j(x_{i_n}) = ⟨y_{j−1}, c_{j−1}(x_{i_n})⟩.


Therefore, if c_j(x_{i_n} + y) = c_j(x_{i_n}) then y_{j−1}(x_{i_n} + y) = y_{j−1} and c_{j−1}(x_{i_n} + y) =
c_{j−1}(x_{i_n}). It thus suffices to prove that


c_{j_{m−1}}(x_{i_n} + y) = c_{j_{m−1}}(x_{i_n}), (9.4)


and then the claim follows for y by reverse induction. We proceed by induction
on m ≥ 1, verifying that (9.4) holds for m and concluding that so do (9.2) and
(9.3). If m = 1, then y = y_{j_0} and c_{j_0}(x_{i_n} + y_{j_0}) = c_{j_0}(x_{i_n}) by choice of y_{j_0}. Thus,
the claim holds for m = 1. Now suppose the claim holds for some m ≥ 1 and fix
y = y_{j_0} + ··· + y_{j_{m−1}} + y_{j_m} ∈ FS({y_i : i < i_n}) with j_m < j_{m−1} < ··· < j_0. Since
j_m < j_{m−1}, we have by hypothesis that


y_{j_m}(x_{i_n} + y_{j_0} + ··· + y_{j_{m−1}}) = y_{j_m}
and
c_{j_m}(x_{i_n} + y_{j_0} + ··· + y_{j_{m−1}}) = c_{j_m}(x_{i_n}).


Together, these yield
c_{j_m}(x_{i_n} + y) = c_{j_m}(x_{i_n} + y_{j_0} + ··· + y_{j_{m−1}}) = c_{j_m}(x_{i_n}).


Thus, the claim holds for m + 1.

Now for each n, all y ∈ FS({y_i : i < i_n}) must satisfy c(y) ≠ ℓ. This
is because by assumption, FS(⋃_{j≤i_n} F_j) does not full-match x_{i_n}, y_i ∈ F_{i+1} for all
i < i_n, and c(x_{i_n}) = c(x_{i_n} + y) = ℓ for all y ∈ FS({y_i : i < i_n}). We can now
complete the proof. Let T ⊆ ω^{<ω} be the following tree: α ∈ T if and only if
α(i) ∈ F_{i+1} for all i < |α| and c(y) ≠ ℓ for all y ∈ FS(range(α)). Clearly, T is
finitely-branching, and our argument above shows that for every n, there is an α ∈ T
of length n. Hence, T is an infinite tree, and so there exists a path f ∈ [T]. Let
J = range(f). By construction, c(y) ≠ ℓ for all y ∈ FS(J). Thus we have proved
that (b) holds.

Now for (2), suppose (a) does not hold with a computable witness. We now use
Lemma 9.9.23 (2) to define the sequence of F_i, I_i, and c_i, maintaining that I_i and
c_i are computable for all i. Thus, the hypothesis that (a) fails for computable sets,
even if not for all sets, is enough to always guarantee the existence of F_{i+1} and I_{i+1}.
It is easily seen that, given (indices for) F_i, I_i, and c_i, (indices for) F_{i+1}
and I_{i+1} can be found uniformly computably in ∅^{(2)}. Thus the sequence of all the
F_i is ∅^{(2)}-computable, as is the tree T defined from them. This tree is also ∅^{(2)}-
computably bounded, hence a path f ∈ [T] can be chosen computable in ∅^{(3)}. As
the construction arranges for F_{i+1} ⊆ I_i > F_i for all i, it follows that F_0 < F_1 < ···
and hence that f is an increasing function. Thus, the desired solution J = range(f)
is computable in f, and so also in ∅^{(3)}, as claimed. □
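The tree T in this proof is finitely branching, so on any finite number of levels a branch can be found by exhaustive search. The sketch below illustrates only that finite search, not the effectivity analysis: levels[i] plays the role of F_{i+1}, and the names find_branch and bit_parity are hypothetical.

```python
from itertools import combinations


def finite_sums(xs):
    """FS(X) for a finite X: sums of nonempty sets of distinct elements."""
    elems = sorted(set(xs))
    return {sum(c) for r in range(1, len(elems) + 1) for c in combinations(elems, r)}


def find_branch(levels, color, ell):
    """Depth-first search for alpha with alpha(i) in levels[i] for all i and
    color(y) != ell for every y in FS(range(alpha)). Returns None if none exists."""

    def extend(alpha):
        if len(alpha) == len(levels):
            return alpha
        for y in sorted(levels[len(alpha)]):
            candidate = alpha + [y]
            if all(color(s) != ell for s in finite_sums(candidate)):
                found = extend(candidate)
                if found is not None:
                    return found
        return None

    return extend([])


def bit_parity(x):
    """Hypothetical sample 2-coloring: parity of the number of 1 bits of x."""
    return bin(x).count("1") % 2
```

For instance, find_branch([{1, 3}, {4, 12}, {16, 48}], bit_parity, 1) returns [3, 12, 48], since every finite sum of these three numbers has an even number of 1 bits.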


Lemma 9.9.25. Suppose I ⊆ ω satisfies the apartness property, and let c: FS(I) →
k, k ≥ 1 be given.


1. One of the following holds.


a. There is a finite set F ⊆ I and an infinite set J ⊆ I ⊥ F such that F
full-matches FS(J).
b. There is an infinite set J ⊆ I such that c is constant on FS(J).


2. If c and I are computable, then J in (a) or (b) can be chosen to be ∅^{(3k−3)}-
computable.


Proof. For (1), again assume (a) fails. For each i < k, we define an infinite set I_i ⊆ I
and a coloring c_i: FS(I_i) → k − i such that:


if c_i(x) = c_i(y) for some x, y ∈ FS(I_i) then also c(x) = c(y). (9.5)


Let c_0 = c and I_0 = I, and assume that c_i and I_i have been defined for some i < k − 1.
Apply Lemma 9.9.24 (1) to c_i and I_i. Since we have assumed that (a) fails for c and
I, it must also fail for c_i and I_i by (9.5). So, there is an ℓ < k − i and an infinite set
J ⊆ I_i such that c_i(x) ≠ ℓ for all x ∈ FS(J). Let I_{i+1} = J. Define c_{i+1} as follows: for
x ∈ FS(I_{i+1}),


c_{i+1}(x) = c_i(x) if c_i(x) < ℓ,
c_{i+1}(x) = c_i(x) − 1 otherwise.


So c_{i+1} is a (k − i − 1)-coloring, and clearly if c_{i+1}(x) = c_{i+1}(y) then c_i(x) = c_i(y)
and hence by induction, c(x) = c(y). By construction, c_{k−1} is constant on FS(I_{k−1}),
and I_{k−1} is infinite. Hence, c is constant on FS(I_{k−1}), so taking J = I_{k−1} gives us (b).

For (2), we assume (a) does not hold with a ∅^{(3k−3)}-computable witness. Now
iterate Lemma 9.9.24 (2) to define the sequence of I_i and c_i above, so that I_i and c_i
are computable in ∅^{(3i)}. Since ∅^{(3i)} ≤_T ∅^{(3k−3)} for all i < k, our assumption suffices
to keep the iteration going. And the solution set J = I_{k−1} is computable in ∅^{(3k−3)},
as claimed. □
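The step from c_i to c_{i+1} in part (1) simply closes the gap left by the unused color ℓ. A minimal sketch of that renumbering follows; the name drop_unused_color is hypothetical, and the code assumes, as in the proof, that the color ℓ never occurs on the inputs it is given.

```python
def drop_unused_color(c_i, ell):
    """Return c_{i+1}: colors below ell are kept, colors above ell shift down by one."""

    def c_next(x):
        value = c_i(x)
        if value == ell:
            raise ValueError("color ell is assumed not to occur on this domain")
        return value if value < ell else value - 1

    return c_next
```

Because the renumbering is injective on the colors that actually occur, two inputs receive the same new color exactly when they had the same old color, which is what property (9.5) needs.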


We are now ready to put all the pieces together. First, let us assemble just the 
combinatorial halves (the first parts of all the lemmas) to prove Hindman’s theorem. 


Proof (of Hindman’s theorem). Fix c: ω → k, k ≥ 1. Applying Corollary 9.9.8, let
I ⊆ ω be an infinite set satisfying the apartness property.

Apply Lemma 9.9.25 to c and I. If alternative (b) holds, we are done.

So assume not. For each i, we define a finite set F_i, an infinite set I_i ⊆ I, and a
finite coloring c_i of FS(I_i). Let F_0 = ∅, I_0 = I, and c_0 = c ↾ FS(I). Suppose we
are given F_i, I_i, and c_i for some i. By hypothesis, there is a finite set F ⊆ I_i and an
infinite set J ⊆ I_i ⊥ F such that F full-matches FS(J) relative to c. Let F_{i+1} = F and
I_{i+1} = J. Define c_{i+1} as follows: given x ∈ FS(I_{i+1}) choose the least y ∈ F_{i+1} such
that c(x) = c(x + y) = c(y), and let c_{i+1}(x) = ⟨y, c_i(x)⟩.

Choose the least ℓ < k such that for every i, there are infinitely many x ∈ I_i
with c(x) = ℓ. For each i, let x_i be the least x ∈ I_i with c(x) = ℓ and such that
x > x_j for all j < i. Thus we have x_0 < x_1 < ···. Analogously to the construction
in Lemma 9.9.24, for each i we can fix y_0 < ··· < y_{i−1} with y_j ∈ F_{j+1} for each j,
and such that c(y) = c(x_i + y) = c(x_i) for all y ∈ FS({y_j : j < i}). In particular,
c(y) = ℓ for all such y.

Let T be the tree of all α ∈ ω^{<ω} with α(j) ∈ F_{j+1} for all j < |α| and such that
c(y) = ℓ for all y ∈ FS(range(α)). Then T is infinite, so we can fix f ∈ [T]. Let
J = range(f). Then c(y) = ℓ for all y ∈ FS(J), as desired. □


Now, we use the effectivity estimates in the lemmas (the second parts) to obtain a 
complexity bound on the set J in the preceding argument. 


Proof (of Theorem 9.9.20). Take A = ∅. The second part of each of the lemmas
above relativizes to any set, so also this proof relativizes without issue. So, fix a
computable c: ω → k, k ≥ 1. By the effectivity of Corollary 9.9.8, we may pick
out an infinite computable set I ⊆ ω satisfying the apartness property.

We now proceed in a slightly different order than before. Let F_0 = ∅, I_0 = I, and
c_0 = c. Assume, inductively, that for some i ∈ ω, we have defined a finite set F_i, an
infinite set I_i ⊆ I, and a finite coloring c_i: FS(I_i) → k_i for some k_i ≥ k, all with
the same properties as before. But assume, in addition, that c_i and I_i are computable
in B = ∅^{(n_i)} for some n_i ∈ ω. (Thus, k_0 = k and n_0 = 0.) It is now that we apply
Lemma 9.9.25, to c_i and I_i. Relativizing part (2) of Lemma 9.9.25 to B, we can
quantify over infinite B^{(3k_i−3)}-computable sets to determine whether part (a) or (b)
holds in the lemma.

If part (b) holds, then we have an infinite B^{(3k_i−3)}-computable J ⊆ I_i such that c_i
is constant on FS(J). By induction, and the way the c_j are defined for all j ≤ i, it follows
that also c is constant on FS(J). Hence, we are done, and our solution is arithmetical.

If (a) holds, then we instead have a finite set F ⊆ I_i and a B^{(3k_i−3)}-computable
J ⊆ I_i ⊥ F such that F full-matches FS(J), and these can be used to define F_{i+1},
I_{i+1}, and c_{i+1} as before. Thus, c_{i+1} will be a (|F_{i+1}| × k_i)-coloring of FS(I_{i+1}). By
cutting off an initial segment if necessary, we may assume I_{i+1} > F_{i+1}.

Letting n_{i+1} = (3k_i − 1) + n_i, we have that c_{i+1} and I_{i+1} are ∅^{(n_{i+1})}-computable.
Notice that to quantify over infinite B^{(3k_i−3)}-computable sets above requires a
(B^{(3k_i−3)})'' oracle. Hence, indices for each of F_{i+1}, I_{i+1}, and c_{i+1}, as well as the
number n_{i+1}, can be found uniformly (B^{(3k_i−3)})''-computably from F_i, I_i, c_i, and n_i.
This completes the construction.

Now, if part (a) holds for every $i$, then we end up constructing the sequence
$F_0 < F_1 < \cdots$ as before. By the preceding paragraph, this sequence is uniformly
computable in $\emptyset^{(\omega)}$, hence so is the tree $T$ defined from the $F_i$ in the proof of
Hindman’s theorem. As this tree is $\emptyset^{(\omega)}$-computable and $\emptyset^{(\omega)}$-computably bounded,
any $X \gg \emptyset^{(\omega)}$ computes a path $f \in [T]$ (Exercise 4.8.5). Since, for each $i$, we
ensured that $I_i > F_i$, $f$ is an increasing function and so $I = \mathrm{range}(f)$ is computable
from $f$ and therefore from $X$. As before, $I$ is an HT-solution to $c$. □


The notions of half-matching and full-matching have been generalized and ex- 
tended by Anglés d’Auriac, Monin, and Patey (see [60]), who used them to give 
a set of combinatorial conditions under which Question 9.9.16 (and, relatedly, the 
question of whether HT admits arithmetical solutions) would have a positive answer. 


9.9.3 Variants with bounded sums 


We now consider two classes of variations of Hindman’s theorem, which are quite 
interesting in their own right, but also help further elucidate some of the connections 
between the computable and combinatorial content of HT. 


Definition 9.9.26. Fix $X \subseteq \omega$.


1. For each $n \ge 1$, $\mathrm{FS}^{=n}(X)$ denotes the set
$\{x_0 + \cdots + x_{n-1} : (\forall i < n)[x_i \in X \wedge (\forall j < n)[i \ne j \to x_i \ne x_j]]\}$.
2. For each $n \ge 1$, $\mathrm{FS}^{\le n}(X) = \bigcup_{1 \le m \le n} \mathrm{FS}^{=m}(X)$.


Thus, $\mathrm{FS}(X) = \bigcup_{n \ge 1} \mathrm{FS}^{=n}(X) = \bigcup_{n \ge 1} \mathrm{FS}^{\le n}(X)$.


Definition 9.9.27 (HT for bounded and exact sums). Fix $n, k \ge 1$.


1. For all $n, k \ge 1$, $\mathsf{HT}^{=n}_k$ is the following statement: for every $c\colon \omega \to k$, there is an
infinite set $I \subseteq \omega$ such that $c$ is constant on $\mathrm{FS}^{=n}(I)$.

2. For all $n, k \ge 1$, $\mathsf{HT}^{\le n}_k$ is the following statement: for every $c\colon \omega \to k$, there is an
infinite set $I \subseteq \omega$ such that $c$ is constant on $\mathrm{FS}^{\le n}(I)$.

3. For all $n \ge 1$, $\mathsf{HT}^{=n}$ is the statement $(\forall k \ge 1)\, \mathsf{HT}^{=n}_k$ and $\mathsf{HT}^{\le n}$ is the statement
$(\forall k \ge 1)\, \mathsf{HT}^{\le n}_k$.


For P any of these statements, P with apartness denotes the same statement, augmented
to include that the solution set $I$ satisfies the apartness property.
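
The sets of exact and bounded sums are easy to compute for a finite set, which can help in checking small cases of the definitions above. The following is a minimal sketch; the function names `fs_exact` and `fs_at_most` are our own and are not notation from the text.

```python
from itertools import combinations


def fs_exact(xs, n):
    """FS^{=n}(X): all sums of n pairwise distinct elements of the finite set xs."""
    return {sum(combo) for combo in combinations(sorted(set(xs)), n)}


def fs_at_most(xs, n):
    """FS^{<=n}(X): the union of FS^{=m}(X) over 1 <= m <= n."""
    sums = set()
    for m in range(1, n + 1):
        sums |= fs_exact(xs, m)
    return sums


X = {1, 2, 4, 8}
print(sorted(fs_exact(X, 2)))    # sums of exactly two distinct elements
print(sorted(fs_at_most(X, 2)))  # sums of one or two distinct elements
```

For a finite set of size $m$, taking $n = m$ in `fs_at_most` gives all of $\mathrm{FS}(X)$.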


The original interest here was from Hindman, Leader, and Strauss [144], who 
observed the following rather surprising fact. Combinatorially, it would seem that
$\mathsf{HT}^{\le 2}$ is a much simpler problem than the full version of Hindman’s theorem, HT.
And yet, it seems just as hard to prove the former as it is the latter. This motivated 
the authors to pose the following question. 


Question 9.9.28 (Hindman, Leader, and Strauss [144]). Is there a proof of $\mathsf{HT}^{\le 2}$ that
would not also be a proof of HT?


This is reminiscent of the comparison of Milliken’s tree theorem and the product 
form of Milliken’s tree theorem in Section 9.7. Indeed, we can quickly recast this 
into the following reverse mathematics question. 


Question 9.9.29. Does $\mathsf{HT}^{\le 2} \to \mathsf{HT}$ over $\mathsf{RCA}_0$ (or even $\mathsf{ACA}_0$)?


And irrespective of the answer, we can also ask the following analogue of Ques- 
tion 9.9.16. 


Question 9.9.30. Is it the case that $\mathsf{ACA}_0 \vdash \mathsf{HT}^{\le 2}$?


Both questions remain (tantalizingly!) open. In Section 10.6.2, we discuss an iterated
version of Hindman’s theorem which is also provable in $\mathsf{ACA}_0^+$, and which is
equivalent over $\mathsf{ACA}_0$ to several other principles.

The principle $\mathsf{HT}^{=n}$ is a bit different from $\mathsf{HT}^{\le n}$. For one, observe that for all $n, k \ge 1$,
$\mathsf{HT}^{=n}$ is identity reducible to $\mathsf{RT}^n$ and $\mathsf{HT}^{=n}_k$ is identity reducible to $\mathsf{RT}^n_k$. (Given an
$\mathsf{HT}^{=n}_k$-instance $c$, define an $\mathsf{RT}^n_k$-instance $d$ by $d(x_0, \ldots, x_{n-1}) = c(x_0 + \cdots + x_{n-1})$.
Now if $H$ is any infinite homogeneous set for $d$ then $c$ is constant on $\mathrm{FS}^{=n}(H)$.)
So the analogue of Question 9.9.29 for $\mathsf{HT}^{=2}$ has a negative answer: we know that
$\mathsf{RT}^2$, and hence $\mathsf{HT}^{=2}$, does not imply $\mathsf{ACA}_0$, but that HT does. We also get an easy
positive answer to the analogue of Question 9.9.30 for $\mathsf{HT}^{=n}$, since we know that
$\mathsf{ACA}_0 \vdash \mathsf{RT}^n$.
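
The identity reduction from exact sums to Ramsey’s theorem can be illustrated on finite data. The sketch below builds the coloring $d$ of $n$-tuples from a coloring $c$ of numbers, and checks on a finite set $H$ that homogeneity for $d$ is the same as $c$ being constant on sums of exactly $n$ distinct elements of $H$. The helper names and the sample coloring `c` are hypothetical, and a finite set only stands in for the infinite homogeneous set of the actual argument.

```python
from itertools import combinations


def rt_instance_from_ht(c, n):
    """Return d on n-tuples of distinct numbers with d(x_0, ..., x_{n-1}) = c(x_0 + ... + x_{n-1})."""
    def d(*xs):
        assert len(xs) == n and len(set(xs)) == n
        return c(sum(xs))
    return d


def homogeneous(d, H, n):
    """True if d takes a single value on all n-element subsets of the finite set H."""
    return len({d(*t) for t in combinations(sorted(H), n)}) <= 1


def constant_on_exact_sums(c, H, n):
    """True if c is constant on FS^{=n}(H) for the finite set H."""
    return len({c(sum(t)) for t in combinations(sorted(H), n)}) <= 1


c = lambda x: x % 3          # a sample 3-coloring of the natural numbers
d = rt_instance_from_ht(c, 2)
H = [3, 6, 9, 12]            # finite stand-in for a homogeneous set
print(homogeneous(d, H, 2), constant_on_exact_sums(c, H, 2))
```

Nothing is computed from the solution here: the same set $H$ serves for both problems, which is exactly what identity reducibility requires.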

Notice that this argument does not work also for $\mathsf{HT}^{\le n}$. All we can get by applying
$\mathsf{RT}^n$ to an instance $c$ of $\mathsf{HT}^{\le n}$ is an infinite set $I \subseteq \omega$ such that $c$ is constant on $\mathrm{FS}^{=m}(I)$
for all $m \le n$. But $c$ may take different colors on $\mathrm{FS}^{=m}(I)$ for different values of $m$.

The relationships between the principles in Definition 9.9.27, for different values
of $n$ and $k$, are far less obvious than, say, for $\mathsf{RT}^n_k$, and much is still unknown. For
example, it is not the case, at least not by the argument we gave before, that P is
always equivalent to P with apartness. The following result shows that for $\mathsf{HT}^{\le n}$ we
can recover this in part.


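Theorem 9.9.31 (Carlucci, Kołodziejczyk, Lepore, and Zdanowski [26]). Fix
$n \ge 2$ and $k \ge 1$.


1. $\mathsf{RCA}_0 \vdash \mathsf{HT}^{\le n}_{2k} \to \mathsf{HT}^{\le n}_k$ with apartness.
2. $\mathsf{HT}^{\le n}_k$ with apartness $\le_{\mathrm{sc}} \mathsf{HT}^{\le n}_{2k}$.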


Proof. Let us prove (2). Fix a coloring $c\colon \omega \to k$. We define $d\colon \omega \to 2k$, as
follows. First, define maps $i, a\colon \omega \to \omega$ as follows. Given $x \in \omega$, fix $n$ and
$a_0, \ldots, a_{n-1} \in \{0, 1, 2\}$ such that $x = \sum_{i < n} a_i \cdot 3^i$. (Thus, we are expressing $x$ in
base 3.) Then, let $i(x)$ be the least $i < n$ such that $a_i > 0$ (or 0 if $x = 0$), and let
$a(x) = a_{i(x)}$. Now, we define $d$: for each $x \in \omega$, let

$$d(x) = \begin{cases} c(x) & \text{if } a(x) = 1, \\ k + c(x) & \text{if } a(x) = 2. \end{cases}$$

Clearly, $d \le_T c$. Let $I \subseteq \omega$ be an $\mathsf{HT}^{\le n}_{2k}$-solution to $d$. Say $d(x) = j < 2k$ for all
$x \in \mathrm{FS}^{\le n}(I)$. If $j < k$, then we must have $a(x) = 1$ for all such $x$, and otherwise we must
have $a(x) = 2$. In either case $c$ is constant on $\mathrm{FS}^{\le n}(I)$. We claim that $i$ is injective on $I$.
Suppose not, and say $i(x) = i(y)$ for some distinct $x, y \in I$.
Since $n \ge 2$, we have that $d(x + y) = j$ and hence either $a(x) = a(y) = a(x + y) = 1$
or $a(x) = a(y) = a(x + y) = 2$. But this is impossible: if $a(x) = a(y) = 1$ then
$a(x + y) = 2$, and if $a(x) = a(y) = 2$ then $a(x + y) = 1$. Thus, the claim holds,
and now it is easy to computably thin $I$ to an infinite set satisfying the apartness
condition. □
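
The base 3 bookkeeping in this proof can be checked mechanically. The following sketch implements the maps $i$ and $a$ and the coloring $d$, under the assumption that $d$ is only ever evaluated at positive numbers (the value at 0 plays no role in the argument). The function names are ours.

```python
def base3_digits(x):
    """Base 3 digits of x, least significant first (empty list for x == 0)."""
    digits = []
    while x > 0:
        digits.append(x % 3)
        x //= 3
    return digits


def i_map(x):
    """Position of the least nonzero base 3 digit of x (0 if x == 0)."""
    for pos, digit in enumerate(base3_digits(x)):
        if digit > 0:
            return pos
    return 0


def a_map(x):
    """Value of the least nonzero base 3 digit of x (0 if x == 0)."""
    digits = base3_digits(x)
    return digits[i_map(x)] if digits else 0


def make_d(c, k):
    """The 2k-coloring d built from a k-coloring c, meant for arguments x >= 1."""
    def d(x):
        return c(x) if a_map(x) == 1 else k + c(x)
    return d


# If x and y share the position and value of their least nonzero digit,
# then x + y keeps the position but the digit flips between 1 and 2.
for x, y in [(3, 12), (6, 15), (1, 4)]:
    print(x, y, a_map(x), a_map(y), a_map(x + y))
```

The printed rows show the digit flip that makes $d(x)$ and $d(x + y)$ differ, which is what forces $i$ to be injective on a solution.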


If we add the apartness condition to $\mathsf{HT}^{=n}_k$, we can also get a converse to the fact
that $\mathsf{ACA}_0 \vdash \mathsf{HT}^{=n}$. Thus, our best known lower bound for the full Hindman’s theorem
is already present in this (comparatively weak) principle.


Theorem 9.9.32 (Carlucci, Kołodziejczyk, Lepore, and Zdanowski [26]). For
each $n \ge 3$ and $k \ge 2$, the following are equivalent over $\mathsf{RCA}_0$.


1. $\mathsf{ACA}_0$.
2. $\mathsf{HT}^{=n}_k$ with apartness.


Proof. First, we argue in $\mathsf{ACA}_0$. Fix a coloring $c\colon \mathbb{N} \to k$. By Corollary 9.9.8, fix
an infinite set $I \subseteq \mathbb{N}$ satisfying the apartness condition. Define $d\colon [I]^n \to k$ as in
our discussion above: $d(x_0, \ldots, x_{n-1}) = c(x_0 + \cdots + x_{n-1})$. Applying $\mathsf{RT}^n_k$, which is provable from
$\mathsf{ACA}_0$ (Corollary 8.2.6), we get an infinite homogeneous set $H \subseteq I$ for $d$. Clearly $H$
satisfies the apartness condition (being a subset of $I$), and $c$ is constant on $\mathrm{FS}^{=n}(H)$
by construction.

For the reversal, we argue in $\mathsf{RCA}_0 + \mathsf{HT}^{=n}_k$ with apartness. Let $f\colon \mathbb{N} \to \mathbb{N}$ be an injective function.
We show that $\mathrm{range}(f)$ exists. Recall the notion of gap from Definition 9.9.11. Say
a gap $\{u, v\}$ in a number $x$ is important if

$$(\exists y < \lambda(x))[y \in \mathrm{range}(f \upharpoonright [u, v])].$$

We define $c\colon \mathbb{N} \to 2$ by $c(x) = 0$ if the number of important gaps in $x$ is even, and
$c(x) = 1$ if the number of important gaps in $x$ is odd. Being an important gap is
$\Sigma^0_0$-definable, so $c$ exists. Fix $I$ infinite and satisfying the apartness property such
that $c$ is constant on $\mathrm{FS}^{=n}(I)$, say with value $j < 2$.

We make the following two claims about /. 


Claim 1: For each $x \in I$, there exists $x_1 > x$ in $I$ such that $c(x + x_1) = j$. To see this,
fix $x$ and let $z$ be such that

$$(\forall y < \lambda(x))[y \in \mathrm{range}(f) \to y \in \mathrm{range}(f \upharpoonright z)]. \tag{9.6}$$

(See Exercise 6.7.3.) Consider any $x_1 < \cdots < x_{n-1}$ in $I$ with $\max\{x, z\} < x_1$. Since
$I$ satisfies the apartness property, we have

$$\lambda(x) = \lambda(x + x_1) = \lambda(x + x_1 + \cdots + x_{n-1}),$$

and so by choice of $z$, each of $x$, $x + x_1$, and $x + x_1 + \cdots + x_{n-1}$ have the same number
of important gaps. Thus in particular,

$$c(x + x_1) = c(x + x_1 + \cdots + x_{n-1}) = j$$

since $x + x_1 + \cdots + x_{n-1} \in \mathrm{FS}^{=n}(I)$.


Claim 2: For all $x < x_1$ in $I$ with $c(x + x_1) = j$, we have

$$(\forall y < \lambda(x))[y \in \mathrm{range}(f) \to y \in \mathrm{range}(f \upharpoonright \mu(x_1))].$$

Let such $x < x_1$ be given, and let $z$ be as in (9.6). Fix $x_2 < \cdots < x_{n-1}$ in $I$ with
$\max\{x_1, z\} < x_2$. By choice of $z$, no gap $\{u, v\}$ with $\lambda(x_2) \le u < v$ is important in
$x + x_1 + \cdots + x_{n-1}$. So, if $\{\mu(x_1), \lambda(x_2)\}$ were important then there would be exactly one
more important gap in $x + x_1 + \cdots + x_{n-1}$ than in $x + x_1$. But $x + x_1 + \cdots + x_{n-1} \in \mathrm{FS}^{=n}(I)$,
so $c(x + x_1 + \cdots + x_{n-1}) = j$ and therefore $c(x + x_1 + \cdots + x_{n-1}) = c(x + x_1)$. Thus,
$\{\mu(x_1), \lambda(x_2)\}$ cannot be important in $x + x_1 + \cdots + x_{n-1}$, and consequently we must
have $z \le \mu(x_1)$.

By the first and second claims, we conclude that $y \in \mathrm{range}(f)$ if and only if

$$(\forall x, x_1 \in I)[y < \lambda(x) \wedge x < x_1 \wedge c(x + x_1) = j \to y \in \mathrm{range}(f \upharpoonright \mu(x_1))].$$

Thus, we have a $\Pi^0_1$ definition of the range of $f$, so the range exists. □
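
To make the coloring in the reversal concrete, here is a small sketch of the parity coloring. It rests on assumptions that are not restated in this section: we take $\lambda(x)$ to be the least exponent in the binary expansion of $x$, a gap in $x$ to be a pair of consecutive exponents of that expansion, and $f \upharpoonright [u, v]$ to be the restriction of $f$ to the interval $[u, v]$. The sample injection `f` is hypothetical.

```python
def exponents(x):
    """Exponents occurring in the binary expansion of x, in increasing order."""
    return [e for e in range(x.bit_length()) if (x >> e) & 1]


def gaps(x):
    """Pairs (u, v) of consecutive exponents of x (assumed reading of 'gap')."""
    exps = exponents(x)
    return list(zip(exps, exps[1:]))


def is_important(gap, x, f):
    """A gap (u, v) of x is important if f(t) < lambda(x) for some t in [u, v]."""
    u, v = gap
    lam = exponents(x)[0]
    return any(f(t) < lam for t in range(u, v + 1))


def parity_coloring(x, f):
    """c(x): parity of the number of important gaps in x, for x >= 1."""
    return sum(is_important(g, x, f) for g in gaps(x)) % 2


f = lambda t: t ^ 1  # a sample injection: swaps 2m and 2m + 1
print(gaps(72), parity_coloring(72, f))
```

Every test performed by `is_important` is a bounded search, which mirrors the observation in the proof that being an important gap is definable with bounded quantifiers only.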


Thus, for $n \ge 3$ and $k \ge 2$, we have in particular that $\mathsf{HT}^{=n}$ with apartness, $\mathsf{HT}^{=n}_k$
with apartness, $\mathsf{RT}^n$, and $\mathsf{RT}^n_k$ are all equivalent. The situation is therefore much as
we might expect naively. For example, we indirectly get many familiar facts, like that 
HT=" > HT=""!, that HT,” and HT,” are equivalent, etc. 

On the other hand, the proof of Theorem 9.9.32 is somewhat surprising. More 
precisely, the coding used is quite different from that used to prove the analogous
result for $\mathsf{RT}^n$ (Corollary 8.2.5). And in fact, the same coding can be used to prove
the following result, which, given that it is a statement about pairs, is arguably more 
unexpected. 


Theorem 9.9.33 (Carlucci, Kołodziejczyk, Lepore, and Zdanowski [26]).
$\mathsf{RCA}_0 \vdash \mathsf{HT}^{\le 2}_2$ with apartness $\to \mathsf{ACA}_0$. Hence, $\mathsf{RCA}_0 \vdash \mathsf{HT}^{\le 2}_4 \to \mathsf{ACA}_0$.


The proof is left to Exercise 9.13.12. Using another elaboration on the same idea, it 
is also possible to prove the following, whose proof we omit. 


Theorem 9.9.34 (Dzhafarov, Jockusch, Solomon, and Westrick [85]).
$\mathsf{RCA}_0 \vdash \mathsf{HT}^{\le 3}_3 \to \mathsf{ACA}_0$.


As noted, we do not even know if HT is provable in ACAg. Thus, we do not have 
equivalences above (at least, not yet). 


In terms of lower bounds, the preceding results leave open the situation for $\mathsf{HT}^{\le 3}_2$
and $\mathsf{HT}^{\le 2}_k$, $2 \le k < 4$. We show that none of these principles is trivial from the point
of view of reverse mathematics by establishing the following lower bound.


Theorem 9.9.35 (Dzhafarov, Jockusch, Solomon, and Westrick [85]).
$\mathsf{RCA}_0 \vdash \mathsf{HT}^{\le 2}_2 \to \mathsf{SRT}^2_2$.


Proof. We argue in $\mathsf{RCA}_0 + \mathsf{HT}^{\le 2}_2$ and derive $\mathsf{D}^2_2$. Fix a stable coloring $c\colon [\mathbb{N}]^2 \to 2$.
Let $i$ and $a$ be the maps $\mathbb{N} \to \mathbb{N}$ defined in the proof of Theorem 9.9.31. Define
$d\colon \mathbb{N} \to 2$ as follows: for all $x \in \mathbb{N}$,

$$d(x) = \begin{cases} c(i(x), x) & \text{if } a(x) = 1, \\ 1 - c(i(x), x) & \text{if } a(x) = 2. \end{cases}$$

Let $I$ be an infinite set such that $d$ is constant on $\mathrm{FS}^{\le 2}(I)$. We will need to prove
several claims.


Claim 1: For all $i \in \omega$ and $a \in \{1, 2\}$, there are at most finitely many $x \in I$
with $i(x) = i$ and $a(x) = a$. Fix $i$ and $a$, and suppose towards a contradiction that
$J = \{x \in I : i(x) = i \wedge a(x) = a\}$ is infinite. Let $\ell$ be the limit color of $i$ under $c$, and
let $s_0$ be large enough so that $c(i, z) = \ell$ for all $z \ge s_0$. Since $J$ is infinite, we can find
$x, y \in J$ with $s_0 < x < y$. So, $c(i, x) = c(i, x + y) = \ell$. Now, because $i(x) = i(y) = i$
and $a(x) = a(y) = a$, we have $i(x + y) = i$ and $a(x + y) \ne a$. Thus, depending as $a$ is
1 or 2, we either have $d(x) = \ell$ and $d(x + y) = 1 - \ell$, or $d(x) = 1 - \ell$ and $d(x + y) = \ell$.
Either way, $d(x) \ne d(x + y)$. But $x$ and $x + y$ both belong to $\mathrm{FS}^{\le 2}(J) \subseteq \mathrm{FS}^{\le 2}(I)$, so
we should have $d(x) = d(x + y)$. This is a contradiction.


Claim 2: For all $i \in \omega$ and $a \in \{1, 2\}$, if $x \in I$ satisfies $i(x) = i$ and $a(x) = a$ then

$$\lim_z c(i, z) = \begin{cases} d(x) & \text{if } a = 1, \\ 1 - d(x) & \text{if } a = 2. \end{cases}$$

Fix $i$, $a$, and $x$. As above, let $\ell$ be the limit color of $i$ under $c$, and let $s_0$ be large
enough so that $c(i, z) = \ell$ for all $z \ge s_0$. Since $I$ is infinite, it follows by Claim 1 that
there is a $y > s_0$ in $I$ with $i(y) > i$. We thus have $c(i, x + y) = \ell$. And since $i(x) = i$,
we also have $i(x + y) = i$ and $a(x + y) = a$. Thus, because $x + y \in \mathrm{FS}^{\le 2}(I)$, we conclude that

$$d(x) = d(x + y) = \begin{cases} \ell & \text{if } a = 1, \\ 1 - \ell & \text{if } a = 2. \end{cases}$$

This proves the claim.


To complete the proof, apply $\mathsf{RT}^1_4$ to find a pair $(j, a) \in 2 \times 2$ such that for
infinitely many $x \in I$, $d(x) = j$ and $a(x) = a$. By Claim 1, the collection of all $i$ such
that $i = i(x)$ for some $x \in I$ with $d(x) = j$ and $a(x) = a$ is infinite. Moreover, as this
collection is $\Sigma^0_1$-definable, it follows by Exercise 5.13.5 that there is an infinite set
consisting only of such $i$. Call this set $L$. By Claim 2, $L$ is limit homogeneous for $c$. □


Interestingly, we can also get this bound for $\mathsf{HT}^{=2}_2$ with apartness. In fact, here we
can prove slightly more. Recall the principle $\mathsf{IPT}^2_2$ from Section 9.3.


Theorem 9.9.36 (Carlucci, Kołodziejczyk, Lepore, and Zdanowski [26]).
$\mathsf{RCA}_0 \vdash \mathsf{HT}^{=2}_2$ with apartness $\to \mathsf{IPT}^2_2$.


Proof. We argue in $\mathsf{RCA}_0$. Fix a coloring $c\colon [\mathbb{N}]^2 \to 2$ and define $d\colon \mathbb{N} \to 2$ as
follows: for all $x \in \mathbb{N}$,

$$d(x) = \begin{cases} c(\lambda(x), \mu(x)) & \text{if } \lambda(x) \ne \mu(x), \\ 0 & \text{if } \lambda(x) = \mu(x). \end{cases}$$

Let $I$ be an infinite set satisfying the apartness property and such that $d$ is constant on
$\mathrm{FS}^{=2}(I)$, say with value $\ell < 2$. Say $I = \{x_0 < x_1 < \cdots\}$, and let $H_0 = \{\lambda(x_{2i}) : i \in \mathbb{N}\}$
and $H_1 = \{\mu(x_{2i+1}) : i \in \mathbb{N}\}$. We claim that $(H_0, H_1)$ is an increasing p-homogeneous
set for $c$, in fact with color $\ell$. To see this, choose $x < y$ with $x \in H_0$ and $y \in H_1$. Then
$x = \lambda(x_i)$ for some even $i$ and $y = \mu(x_j)$ for some odd $j$. In particular, $i \ne j$. Now if
$j < i$ then also $x_j < x_i$, which implies that $\mu(x_j) < \lambda(x_i)$ by virtue of $I$ satisfying
the apartness property. Since this is not the case, we must have $i < j$. Hence, again
because $I$ satisfies the apartness property, we also have that $\mu(x_i) < \lambda(x_j)$. So,
$x = \lambda(x_i) = \lambda(x_i + x_j)$ and $y = \mu(x_j) = \mu(x_i + x_j)$. Since $x_i + x_j \in \mathrm{FS}^{=2}(I)$, it now
follows by definition that

$$c(x, y) = c(\lambda(x_i + x_j), \mu(x_i + x_j)) = d(x_i + x_j) = \ell.$$

□
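
The key step of this proof is that, under apartness, the least and greatest binary exponents of $x_i + x_j$ are inherited from $x_i$ and $x_j$ respectively. The sketch below checks this on a small set, assuming (as earlier in the chapter) that $\lambda(x)$ and $\mu(x)$ are the least and greatest exponents in the binary expansion of $x \ge 1$. The value 0 in the case $\lambda(x) = \mu(x)$ and the sample coloring are illustrative choices.

```python
def lam(x):
    """Least exponent in the binary expansion of x >= 1."""
    return (x & -x).bit_length() - 1


def mu(x):
    """Greatest exponent in the binary expansion of x >= 1."""
    return x.bit_length() - 1


def has_apartness(xs):
    """True if mu(a) < lam(b) whenever a < b are consecutive in the finite set xs."""
    xs = sorted(xs)
    return all(mu(a) < lam(b) for a, b in zip(xs, xs[1:]))


def make_d(c):
    """The coloring d(x) = c(lam(x), mu(x)), with value 0 when lam(x) == mu(x)."""
    def d(x):
        return c(lam(x), mu(x)) if lam(x) != mu(x) else 0
    return d


I = [1, 6, 24, 96]  # binary: 1, 110, 11000, 1100000
print(has_apartness(I))
for i in range(len(I)):
    for j in range(i + 1, len(I)):
        s = I[i] + I[j]
        print(I[i], I[j], lam(s) == lam(I[i]), mu(s) == mu(I[j]))
```

Since $\lambda(a) \le \mu(a)$ always holds, checking consecutive elements in `has_apartness` is enough to get the property for all pairs.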


The only nontrivial principle from Definition 9.9.27 remaining for which we have
not established a lower bound above $\mathsf{RCA}_0$ is $\mathsf{HT}^{=2}$ (and $\mathsf{HT}^{=2}_k$, $k \ge 2$), without
apartness. We turn to this principle in the next section. Before wrapping up this
section, though, let us briefly note a few other variants of HT that have been considered
in the literature. First, Carlucci, Kołodziejczyk, Lepore, and Zdanowski [26] also
looked at “bounded versions” of the principle FUT. The principles $\mathsf{FUT}^{=n}$, $\mathsf{FUT}^{=n}_k$,
$\mathsf{FUT}^{\le n}$, and $\mathsf{FUT}^{\le n}_k$ can be defined in the obvious way. As in Proposition 9.9.14, each
of these is equivalent to the corresponding version of HT, with apartness. Carlucci
[25, 24] also considered several “weak yet strong” versions of HT, with the property
that they not only imply $\mathsf{ACA}_0$ (like the full HT) but they are easily provable in
$\mathsf{ACA}_0$ (and so seem weaker). A contrasting view might be that these give credence to
Question 9.9.30 having an affirmative answer. Carlucci [25] also showed that several
well-known partition regularity results, including Van der Waerden’s theorem, can
be expressed in terms of such restrictions of HT.
We summarize the relationships established in this section in Figure 9.4.


[Figure 9.4: implication diagram not reproduced here.]

Figure 9.4. Relationships between some of the versions of Hindman’s theorem for bounded sums.
Here, $n \ge 3$ and $k \ge 3$ are arbitrary. Arrows denote implications over $\mathsf{RCA}_0$; double arrows are
implications that cannot be reversed.


9.9.4 Applications of the Lovász local lemma


The goal of this section is to present a proof of the following theorem of Csima, 
Dzhafarov, Hirschfeldt, Jockusch, Solomon, and Westrick [57]. 


Theorem 9.9.37. $\mathsf{HT}^{=2}_2$ omits computable solutions.


This turns out to be surprisingly different from the case of the other bounded
versions of Hindman’s theorem considered above. For all of those, we were able to
establish implications to principles above $\mathsf{RCA}_0$ (whether to $\mathsf{ACA}_0$, $\mathsf{SRT}^2_2$, or $\mathsf{IPT}^2_2$)
by clever, but nonetheless elementary, combinatorial arguments. For $\mathsf{HT}^{=2}$ (without
the assumption of the apartness property!) any kind of direct combinatorial coding
turns out to be very difficult.

Instead, the proof of Theorem 9.9.37 makes an unexpected use of a deep probabilistic
method, the Lovász local lemma. Originally proved by Erdős and Lovász [98],
the lemma has numerous important applications in combinatorics, especially for
obtaining nonconstructive existence results. Informally, the Lovász local lemma asserts
that if we have a (potentially infinite) series of sufficiently independent events, each
of which has only a low probability of occurring, then there is a nonzero chance that
none of them occur.

Here, we will actually only be interested in a very specific version of the Lovász
local lemma, and an effectivization of this version at that. So, we omit explicitly
stating the lemma in full generality and full technical detail, and instead refer the
reader to the book of Graham, Rothschild, and Spencer [128] for a thorough discussion,
including a proof. Actually, there are many nice and self-contained proofs of
the Lovász local lemma (see, e.g., [327]). The following is an effectivization of the
so-called asymmetric Lovász local lemma introduced by Spencer [301].


Theorem 9.9.38 (Rumyantsev and Shen [263]). For each $q \in \mathbb{Q} \cap (0, 1)$ there
is an $N \in \omega$ as follows. Suppose $\langle \sigma_0, \sigma_1, \ldots \rangle$ is a computable sequence of partial
functions $\omega \to 2$ such that the following hold.


1. For each $i$, $\mathrm{dom}(\sigma_i)$ is finite and has size at least $N$.

2. For each $n \ge N$, each $x \in \omega$ is in the domain of at most $2^{qn}$ many $\sigma_i$ with
$|\mathrm{dom}(\sigma_i)| = n$.

3. For each $n \ge N$ and each $x \in \omega$, the set of $i$ such that $x \in \mathrm{dom}(\sigma_i)$ and
$|\mathrm{dom}(\sigma_i)| = n$ is uniformly computable in $n$ and $x$.


Then there is a computable function $c\colon \omega \to 2$ such that for each $i$, $c(x) = \sigma_i(x)$ for
some $x \in \mathrm{dom}(\sigma_i)$.
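
The conclusion of the theorem is easy to state for finitely many partial functions on a finite domain, where a coloring can simply be searched for. The sketch below does this by brute force. It only illustrates the shape of the conclusion and makes no use of the hypotheses of the theorem, so it says nothing about the effective construction of Rumyantsev and Shen. Partial functions are represented as dictionaries, and the name `find_coloring` is ours.

```python
from itertools import product


def find_coloring(sigmas, domain_size):
    """Search for c: {0, ..., domain_size - 1} -> {0, 1} that agrees with every
    partial function in sigmas on at least one point of its domain."""
    for bits in product((0, 1), repeat=domain_size):
        if all(any(bits[x] == v for x, v in sigma.items()) for sigma in sigmas):
            return list(bits)
    return None


# Pairs of constant partial functions, as in the application to Theorem 9.9.37:
# agreeing with both members of a pair means c is not constant on their domain.
sigmas = [
    {0: 0, 1: 0, 2: 0},
    {0: 1, 1: 1, 2: 1},
    {2: 0, 3: 0, 4: 0},
    {2: 1, 3: 1, 4: 1},
]
print(find_coloring(sigmas, 5))
print(find_coloring([{0: 0}, {0: 1}], 1))  # domains too small: no coloring exists
```

The second call shows why some lower bound on the sizes of the domains, as in clause (1), cannot be dropped.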


Let us see the theorem in action and derive Theorem 9.9.37 using it. We will discuss
more applications afterward. In what follows, if $F \subseteq \omega$ and $y \in \omega$, write $F + y$ for
the set $\{x \in \omega : (\exists w)[w \in F \wedge x = w + y]\}$.


Proof (of Theorem 9.9.37). As usual, we prove the result in unrelativized form. Our
goal is thus to define a computable coloring $c\colon \omega \to 2$ such that for every $e$, if the c.e. set $W_e$ is
infinite, then $c$ is not constant on $\mathrm{FS}^{=2}(W_e)$. To achieve the latter, we will ensure that
there is a finite set $F \subseteq W_e$ such that $c$ is not constant on $F + y$ for any sufficiently
large $y$. Since $F + y \subseteq \mathrm{FS}^{=2}(W_e)$ for every $y \in W_e \setminus F$, this yields the desired conclusion.
The idea now is to obtain $c$ by a suitable application of Theorem 9.9.38.

Here, we will follow the convention that for every $e$, $W_{e,0} = \emptyset$ and
$|W_{e,s+1} \setminus W_{e,s}| \le 1$ for all $s$. (That is, in the computable enumeration of $W_e$, at most one new
element is enumerated at each stage.)

Fix any $q \in \mathbb{Q} \cap (0, 1)$. Say, for concreteness, $q = 1/2$. Let $N$ be as given
by Theorem 9.9.38 for this $q$. By increasing $N$ if necessary, we may assume that
$n < 2^{qn - 1}$ for all $n \ge N$. For each $e \in \omega$, let $n_e = N + e$, and for each $k = \langle e, s, y \rangle$
define a finite set $E_k$ as follows. If $|W_{e,s}| = n_e$ and $|W_{e,t}| \ne n_e$ for any $t < s$, we let
$E_k = W_{e,s} + y$, and otherwise we let $E_k = \emptyset$. Thus, either $|E_k| = n_e$ or $|E_k| = 0$.

We claim that for each $n \ge N$, every $x \in \omega$ is in at most $n$ many $E_k$ with $|E_k| = n$.
Say $n = n_e$. If $W_e$ does not contain at least $n$ elements, then there are no $E_k$ with
$|E_k| = n$, so there is nothing to prove. Otherwise, we can let $s$ be least such that
$|W_{e,s}| = n$. Then if $x \in E_k$ and $|E_k| = n$, we must have that $k = \langle e, s, y \rangle$ for some $y$,
and that $x = w + y$ for some $w \in W_{e,s}$. But there are only $n$ many elements in $W_{e,s}$,
so at most $n$ many different $y$ such that $x - y$ belongs to $W_{e,s}$. This proves the claim.

It is clear that for each $n \ge N$ and each $x \in \omega$, the set of $k$ such that $x \in E_k$
and $|E_k| = n$ is uniformly computable in $n$ and $x$. Let $e = n - N$, so that $n = n_e$.
Now, given $k$, we have that $x \in E_k$ if and only if $k = \langle e, s, y \rangle$ for some $s$ and $y$ and
$x = w + y$ for some $w \in W_{e,s}$.

To complete the proof, we define a computable sequence $\langle \sigma_i : i \in \omega \rangle$ of partial
functions $\omega \to 2$ with finite domain. Fix $i$, and assume we have defined $\sigma_j$ for all $j < 2i$.
First, find the least $k = \langle e, s, y \rangle$ such that $|W_{e,s}| = n_e$, $|W_{e,t}| \ne n_e$ for any $t < s$,
and no $\sigma_j$ with $j < 2i$ has $\mathrm{dom}(\sigma_j) = E_k$. Then, let $\mathrm{dom}(\sigma_{2i}) = \mathrm{dom}(\sigma_{2i+1}) = E_k$,
and define $\sigma_{2i}(x) = 0$ and $\sigma_{2i+1}(x) = 1$ for all $x \in E_k$. Thus, for every $k$, if $E_k \ne \emptyset$
then $E_k = \mathrm{dom}(\sigma_i)$ for some $i$.

By construction, the sequence $\langle \sigma_i : i \in \omega \rangle$ satisfies clauses (1) and (3) in the
statement of Theorem 9.9.38. Given $n \ge N$ and $x$, since $x$ is in at most $n$ many
$E_k$ with $|E_k| = n$, it follows that $x$ is in the domain of at most $2n$ many $\sigma_i$ with
$|\mathrm{dom}(\sigma_i)| = n$. But $2n < 2 \cdot 2^{qn - 1} = 2^{qn}$ by choice of $N$, so $\langle \sigma_i : i \in \omega \rangle$ also
satisfies clause (2).

So, let $c\colon \omega \to 2$ be the computable function given by Theorem 9.9.38. Fix $i$, and
let $k$ be such that $\mathrm{dom}(\sigma_{2i}) = \mathrm{dom}(\sigma_{2i+1}) = E_k$. Then there exist $x_0, x_1 \in E_k$ such
that $c(x_0) = \sigma_{2i}(x_0) = 0$ and $c(x_1) = \sigma_{2i+1}(x_1) = 1$. Hence, $c$ is not constant on $E_k$.
But for every infinite $W_e$ there is a finite $F \subseteq W_e$ (namely, $F = W_{e,s}$ for the least $s$
for which $|W_{e,s}| = n_e$) such that for each $y$ there is a $k$ with $F + y = E_k$. So as mentioned at
the outset, $c$ is not constant on $\mathrm{FS}^{=2}(W_e)$. The theorem is proved. □
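
Two elementary facts about translates carry the combinatorics of this proof: a number $x$ lies in at most $|F|$ translates $F + y$, and $F + y \subseteq \mathrm{FS}^{=2}(W)$ whenever $F \subseteq W$ and $y \in W \setminus F$. Both can be checked on finite examples, as in the following sketch. The sets `F` and `W` are arbitrary illustrative choices.

```python
from itertools import combinations


def translate(F, y):
    """The set F + y."""
    return {w + y for w in F}


def shifts_hitting(F, x):
    """All y >= 0 with x in F + y; each element of F contributes at most one."""
    return sorted(x - w for w in F if x - w >= 0)


def fs_exact_two(W):
    """FS^{=2}(W) for a finite set W."""
    return {a + b for a, b in combinations(sorted(W), 2)}


F = {2, 3, 7, 11}
print(shifts_hitting(F, 10))   # the shifts y for which 10 lies in F + y
W = F | {20}
print(translate(F, 20) <= fs_exact_two(W))
```

The bound on `shifts_hitting` is the finite form of the claim that every $x$ is in at most $n$ many sets $E_k$ of size $n$.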


As noted in [57], the key insight that the Lovász local lemma might be of use
in proving Theorem 9.9.37 is the following. First, observe that if $F \subseteq W_e$ is finite
and $y$ and $y^*$ are far enough apart, then $F + y$ and $F + y^*$ are disjoint. Thus, how a
coloring $c\colon \omega \to 2$ is defined on $F + y$ is entirely independent of how it is defined
on $F + y^*$. Furthermore, the probability that $F + y$ is homogeneous for $c$ with a given color is only
$2^{-|F|}$, and hence small if $|F|$ is large. If we think of coloring finite sets as events, we
can thus recognize the general shape of the premise of the Lovász local lemma, as
informally described earlier.
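
The two quantitative points in this heuristic can be verified directly for small sets. The sketch below counts colorings to confirm that a uniformly random 2-coloring is constant on a set of size $m$ with a fixed color with probability $2^{-m}$ (and with either color with probability $2^{1-m}$), and checks disjointness of translates once the shifts differ by more than the diameter of $F$. The function names are ours.

```python
from fractions import Fraction
from itertools import product


def prob_constant(m, color=None):
    """Probability that a uniformly random 2-coloring of an m-element set (m >= 1)
    is constant; if color is given, constant with that particular color."""
    colorings = list(product((0, 1), repeat=m))
    if color is None:
        good = [c for c in colorings if len(set(c)) == 1]
    else:
        good = [c for c in colorings if all(v == color for v in c)]
    return Fraction(len(good), len(colorings))


def translates_disjoint(F, y1, y2):
    """True if F + y1 and F + y2 have no element in common."""
    return not ({w + y1 for w in F} & {w + y2 for w in F})


print(prob_constant(4, color=0), prob_constant(4))
F = {2, 3, 7, 11}
print(translates_disjoint(F, 0, 10), translates_disjoint(F, 0, 4))
```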

We omit the proof of the following theorem of Csima, Dzhafarov,
Hirschfeldt, Jockusch, Solomon, and Westrick [57], which can be proved by a more
careful implementation of Theorem 9.9.38. This extends Theorem 9.9.37 and has a
couple of interesting corollaries.


Theorem 9.9.39 (Csima, et al. [57]). There is a computable instance of $\mathsf{HT}^{=2}_2$ all
of whose solutions are DNC relative to $\emptyset'$.


Corollary 9.9.40 (Csima, et al. [57]).


1. $\mathsf{HT}^{=2}_2$ omits $\Sigma^0_2$ solutions.
2. $\mathsf{RCA}_0 \vdash \mathsf{HT}^{=2}_2 \to \mathsf{RRT}^2_2$.


Proof. For (1), note that solutions to instances of $\mathsf{HT}^{=2}_2$ are closed under infinite
subset. Every infinite $\Sigma^0_2$ set has an infinite $\Delta^0_2$ subset, but of course no $\Delta^0_2$ set can be
DNC relative to $\emptyset'$. Part (2) is obtained by formalizing the proof of Theorem 9.9.39
in $\mathsf{RCA}_0$ and applying Theorem 9.4.13. □


The Lovász local lemma (or more precisely, the computable version due to
Rumyantsev and Shen, Theorem 9.9.38) has also found applications to other problems
in reverse mathematics and computable combinatorics, for example in the
work of Liu, Monin, and Patey [198] on ordered variable words. Hirschfeldt and
Reitzes [151] have also recently employed it in their study of “thin set versions” of
Hindman’s theorem. And an analysis of constructive properties of other versions of
the Lovász local lemma has recently been undertaken by Mourad [223].


9.10 Model theoretic and set theoretic principles 


Our discussion will now appear to digress from Ramsey’s theorem and all things 
combinatorial, and delve instead into some model theory and set theory. We say 
“appear” because (as may be inferred from the fact that this section is still included 
in a chapter on combinatorics) the principles we will discuss nevertheless turn out 
to be deeply combinatorial, and indeed, even related to some of the consequences of
$\mathsf{RT}^2_2$ discussed above.


9.10.1 Languages, theories, and models 


We begin with some background on effectivizing and formalizing the basic
elements of logic and model theory. Complete introductions to computable model
theory, also known as computable structure theory, are given by Ash and Knight [4],
Harizanov [136], and Montalbán [221].

Throughout, all formal languages we will consider will be countable. This makes
formalizing many notions straightforward, just with $\omega$ used as the domain of all
representations. For example, a language is formally a sequence of disjoint subsets
$V$, $C$, and for each $k > 0$, $F_k$ and $R_k$, of $\omega$ (any of which may be empty). We
interpret these in the usual way: $V$ is the set of variables, $C$ the set of constant
symbols, $F_k$ the set of function symbols of arity $k$, and $R_k$ the set of relation symbols
of arity $k$.

By using our coding for finite sequences of numbers, given a language we can 
in turn represent terms, formulas, and deductions (relative to some fixed standard 
deductive system) as subsets of w. In the classical setting, these codings are developed 
in standard accounts of Gödel’s incompleteness theorem. The codings are used for
much more general purposes in computable model theory. 

The sets of (codes for) terms and formulas are each clearly computable from the 
language, as is the set of codes for deductions (as we can assume our deductive system 
is computable). All of this can be directly formalized in RCAg, and in particular, 
given a language, RCAg can prove that each of these sets exists. 

The coding extends to sets of formulas, which are naturally regarded as subsets 
of w via codes. The notions of consistency and completeness of a set of formulas 
are defined as usual. We distinguish arbitrary sets of formulas from theories, which 
we understand to be sets of formulas closed under deduction. This is a nontrivial
distinction, because $\mathsf{RCA}_0$ is not able to form the deductive closure of a set of formulas,
as the following proposition shows. We leave the proof as an exercise.


Proposition 9.10.1. Over $\mathsf{RCA}_0$, the following are equivalent.


1. $\mathsf{WKL}_0$.
2. For every language $\mathcal{L}$, every consistent set of $\mathcal{L}$-sentences is contained in a
consistent $\mathcal{L}$-theory.


In computable structure theory, a computable theory (or more generally, an
$A$-computable theory, for some $A \in 2^\omega$) is more commonly called a decidable theory
(an $A$-decidable theory).

For structures, we restrict first of all to those with countable domain. Since we 
are working in a countable language anyway, this is of no great consequence by 
the downward Löwenheim–Skolem theorem. The domain of a structure can thus be
identified with a subset of w. The rest of the structure then includes a subset of the 
domain interpreting the constant symbols, along with functions and relations on this 
domain of appropriate arities interpreting the corresponding function and relation 
symbols. We also always assume that a structure comes together with its elementary 
diagram, i.e., the set of all (codes for) sentences with parameters for elements of 
the domain that are true in the structure. By taking disjoint unions, a structure can 
again be coded by a single subset of w, and if this is computable (A-computable) the 
structure is called a decidable structure (A-decidable structure). So in particular, in 
a decidable structure the satisfaction predicate, $\models$, is computable.

In computable structure theory there is also interest in structures where the
elementary diagram is not included. The computable ($A$-computable) such structures
are then simply called computable structures ($A$-computable structures). In these,
only the atomic diagram is available. Every decidable structure is computable, but
not conversely. For example, the standard model of Peano arithmetic is computable,
but if it were decidable then its elementary diagram would be a computable (hence
computably axiomatizable) complete consistent extension of Peano arithmetic,
contradicting Gödel’s first incompleteness theorem. This argument can be extended to
yield Tennenbaum’s theorem (see [308]), that the only computable model of Peano
arithmetic is the standard one.

However, it is possible for a computable structure to have a decidable copy, say, or 
for every computable model of some theory to in fact be decidable. In general there 
are many aspects of computable and decidable structures that can be considered. See, 
e.g., Cholak, Goncharov, Khoussainov, and Shore [29] or Harrison-Trainor [139] for 
more thorough discussions. Our interest in this section will be in the stronger type 
of model, with elementary diagram included, as is common in reverse mathematics. 


9.10.2 The atomic model theorem 


We will consider several theorems of classical model theory concerning the omitting
and realizing of types. We quickly review the key underlying notions. For a more
complete introduction to these topics, consult any standard introduction to model
theory, e.g., Marker [202] or Sacks [265].


Definition 9.10.2. Fix a (countable) language $\mathcal{L}$.


1. A formula $\varphi$ is an atom of a theory $T$ if for every formula $\psi$ in the same free
variables, exactly one of $T \vdash \varphi \to \psi$ or $T \vdash \varphi \to \neg\psi$ holds.
2. A theory $T$ is atomic if for every formula $\psi$ such that $T + \psi$ is consistent there is
an atom $\varphi$ of $T$ such that $T \vdash \varphi \to \psi$.
3. A set $\Gamma$ of formulas in the same fixed variables is:

a. a partial type of a theory $T$ if $T + \Gamma$ is consistent,
b. a (complete) type of a theory $T$ if it is a partial type not properly contained
in any other partial type.


4. A partial type $\Gamma$ is principal (or isolated) if there is a formula $\varphi$ such that $T + \varphi$
is consistent and $T \vdash \varphi \to \psi$ for all $\psi \in \Gamma$. We say $\varphi$ isolates $\Gamma$.
5. A partial type $\Gamma$ of a theory $T$ is:


a. realized in a model $\mathcal{B} \models T$ if there is a tuple $b$ of elements of $\mathcal{B}$ such that
$\mathcal{B} \models \psi(b)$ for all $\psi \in \Gamma$,
b. omitted in a model $\mathcal{B} \models T$ if it is not realized in $\mathcal{B}$.


6. For a theory $T$, a model $\mathcal{B} \models T$ is atomic if the only types of $T$ realized in $\mathcal{B}$ are
principal.

Throughout this section, we will be interested exclusively in complete theories. If
$T$ is complete, then every principal partial type of $T$ is realized in every model of $T$.
(Suppose $\Gamma$ is isolated by $\varphi$. Then $T + \varphi$ is consistent, hence by completeness, $T$ must
include $(\exists x)\varphi(x)$. It follows that if $\mathcal{B}$ is any model of $T$ then $\varphi(b)$ holds in $\mathcal{B}$ for
some tuple $b$ of elements of $\mathcal{B}$. Hence, $\psi(b)$ holds in $\mathcal{B}$ for all $\psi \in \Gamma$.) By contrast,
the famous omitting types theorem, due to Henkin [141] and Orey [236], says that if
$\Gamma$ is nonprincipal then it is omitted in some countable model of $T$. (More generally,
any countable collection of nonprincipal types can be omitted in some such model.)
An atomic model (of a complete theory) is thus one which omits every type it can.

Classically, a theory has an atomic model if and only if it is atomic. Hirschfeldt, 
Shore, and Slaman [153] showed that the “only if” direction of this equivalence was 
provable in RCAo. They called the converse the atomic model theorem. 


Definition 9.10.3 (Hirschfeldt, Shore, and Slaman [153]). The atomic model the- 
orem (AMT) asserts: every complete atomic theory has an atomic model. 


This principle has its origins in much older work in computable structure theory 
going back to the 1970s. Millar [212] constructed a complete atomic decidable 
theory with no computable atomic model, which in particular means that AMT is 
not computably true. Goncharov and Nurtazin [127] and Harrington [137] showed
that a complete atomic decidable theory has a decidable atomic model if and only if there
is a computable listing of the theory’s principal types. (Here, by a listing of a class
$C \subseteq 2^\omega$ we mean a sequence $\langle X_i : i \in \omega \rangle$ such that $X_i \in C$ for each $i$ and each
$X \in C$ is equal to $X_i$ for some (possibly many) $i \in \omega$.) This is a very useful result,
which shows by relativization that, to construct an A-computable atomic model of 
a given complete atomic decidable theory, it suffices to produce an A-computable 
enumeration of its principal types. This is actually a combinatorial task, as provided 
by the following definition and theorem. 


Definition 9.10.4. Fix a tree $U \subseteq 2^{<\omega}$.


1. $f \in [U]$ is isolated in $U$ if there is a $\sigma \in U$ such that $f$ is the only path of $U$
extending $\sigma$. We say $\sigma$ isolates $f$ in $U$.

2. The isolated paths through $U$ are dense in $U$ if every $\sigma \in U$ is extended by some
isolated $f \in [U]$.

3. A listing of the isolated paths of $U$ is a sequence $\langle f_i : i \in \omega \rangle$ such that each
$f_i$ is an isolated path of $U$ and each isolated path of $U$ is equal to $f_i$ for some
(possibly many) $i \in \omega$.


Theorem 9.10.5 (Csima, Hirschfeldt, Knight, and Soare [58]). 


1. If T is a complete atomic decidable theory, there is an infinite computable tree
U ⊆ 2^{<ω} with no dead ends and isolated paths dense such that:

a. every path of U computes a type of T,
b. every type of T computes a path of U,
c. every listing of the isolated paths of U computes a listing of the principal
types of T.

2. If U ⊆ 2^{<ω} is an infinite computable tree with no dead ends and isolated paths
dense, there is a complete atomic decidable theory T such that:

a. every path of U computes a type of T,
b. every type of T computes a path of U,
c. every listing of the principal types of T computes a listing of the isolated
paths of U.


Thus, to understand the degrees of atomic models of complete atomic decidable
theories we can study the degrees of listings of isolated paths of trees U as above. 
Csima, Hirschfeldt, Knight, and Soare [58] called the degrees of such sets prime 
bounding, in reference to prime models. (Prime models are those that elementarily 
embed into every model of the same complete theory. These coincide with atomic 
models for countable theories.) Thus, X has prime bounding degree if and only 
if it computes a solution to every computable instance of AMT. In [58], several 
characterizations of the prime bounding degrees were provided, with the following 
being a notable example. 


Theorem 9.10.6 (Csima, Hirschfeldt, Knight, and Soare [58]). If X ≤_T ∅′ then
X has prime bounding degree if and only if it is nonlow₂.


Since AMT is not computably true, we get a lower bound in the form of nonprovability
in RCA₀. The following upper bound complements Theorem 9.10.5 in
providing a combinatorial connection for AMT.


Theorem 9.10.7 (Hirschfeldt, Shore, and Slaman [153]). RCA₀ proves that SADS
implies AMT.


The proof is a fairly involved priority construction, requiring the method of Shore 
blocking (discussed in Section 6.4) to formalize in RCA₀. For a nice self-contained
presentation of the argument, see [147]. The main takeaway of the result, beyond 
the surprising relationship to linear orders, is that AMT is actually a relatively weak 
principle. In fact, it is strictly weaker than SADS. 


Theorem 9.10.8 (Hirschfeldt, Shore, and Slaman [153]). AMT <_ω SADS. Hence,
also RCA₀ ⊬ AMT → SADS.


By Theorem 9.2.19, SADS admits low solutions, and hence so does AMT. In particular,
there is an ω-model of AMT consisting entirely of low sets. However, by
Theorem 9.10.6, no such model can consist of sets all below a single fixed low
set. (Since every ω-model contains every complete atomic decidable theory, any set
computing every member of an ω-model of AMT has prime bounding degree.) On
the other hand, by Theorem 4.6.16 there is such an ω-model of WKL, so in particular,
AMT ≰_ω WKL.

On the first order side of things, we have the following result concerning the 
notion of restricted Π^1_2 conservativity from Section 8.7.4.


Theorem 9.10.9 (Hirschfeldt, Shore, and Slaman [153]). AMT is restricted Π^1_2
conservative over RCA₀.

The proof is similar to that of Theorem 8.7.23, but using Cohen forcing instead of
Mathias forcing. (The analogue of Lemma 8.7.24 is that, if M is a model of RCA₀
and ψ is a Σ^0_3 formula such that M ⊨ (∀Y)[¬ψ(A, Y)] for some A ∈ S^M, then the
same holds in M[G] for G sufficiently generic for Cohen forcing in M.) It follows,
for one, that AMT does not imply BΣ^0_2, which in combination with Theorem 9.10.8
means it does not imply any other principles in Figure 9.1.

But AMT does imply some new principles. Millar [212] showed that the classical 
omitting types theorem admits computable solutions. The following stronger version, 
where (complete) types are replaced by partial types, turns out to be more interesting. 


Definition 9.10.10 (Hirschfeldt, Shore, and Slaman [153]). The omitting partial 
types principle (OPT) is the following statement: for every complete theory T and 
every enumerable collection C of partial types of T there is a model of T omitting 
all nonprincipal elements of C. 


Proposition 9.10.11 (Hirschfeldt, Shore, and Slaman [153]). RCA₀ ⊢ AMT →
OPT but AMT ≰_ω OPT.


The striking fact, proved in [153] and used in separating OPT from AMT, is that OPT
has a characterization purely in computability theoretic terms. 


Theorem 9.10.12 (Hirschfeldt, Shore, and Slaman [153]). RCA₀ ⊢ OPT ↔
(∀X)(∃Y)[Y is hyperimmune relative to X].


This is reminiscent of Miller’s Theorem 9.4.13, which characterized the rainbow 
Ramsey’s theorem in terms of functions DNC relative to ∅′, only here we have
an even more ubiquitous computability theoretic property. Any principle that omits
solutions of hyperimmune free degree (e.g., COH, RRT^2_2, EM) therefore implies OPT, at
least over ω-models. In contrast, any principle that admits solutions of hyperimmune
free degree does not imply OPT. Notably, OPT ≰_ω WKL.

The way this can be used to build an ω-model of OPT in which AMT fails is as
follows. Fix a noncomputable low₂ c.e. set A. Using Sacks’ theorem that the c.e. degrees are dense
(meaning, if X <_T Y are c.e., then there exists a c.e. set Z such that X <_T Z <_T Y),
we can build a sequence of c.e. sets, A_0 <_T A_1 <_T ··· <_T A. Then for each i we have
A_i <_T A_{i+1} <_T A <_T ∅′ ≤_T A_i′, so by Theorem 2.8.20, relativized to A_i, it follows
that A_{i+1} is hyperimmune relative to A_i. Hence, S = {Y ∈ 2^ω : (∃i)[Y ≤_T A_i]} is
an ω-model of OPT. On the other hand, by Theorem 9.10.6, no low₂ set below ∅′
can bound a model of AMT. (A similar argument can be used to separate OPT from
virtually every principle we have discussed so far.)

In the use of Sacks’ density theorem above, we have a rare application in reverse 
mathematics of a classical computability theoretic result proved by an infinite injury 
argument. We will see another in the proof of Corollary 9.10.24 below. There is a 
more direct separation that avoids injury altogether, due to Patey [244], who showed 
that EM does not imply AMT over RCA₀. As remarked in Section 9.5, EM omits
solutions of hyperimmune free degree, and the proof of this formalizes to show that 
EM implies OPT. 

Along with OPT, Hirschfeldt, Shore, and Slaman [153] introduced a weak version
of AMT known as the atomic model theorem with subenumerable types. Say two
partial types Γ and Γ* are equivalent if for all formulas ψ, Γ ⊢ ψ ↔ Γ* ⊢ ψ. A
subenumeration of the types of a theory T is a listing ⟨Γ_i : i ∈ ω⟩ of a class of
partial types of T such that for every type Γ of T there is an i such that Γ and Γ_i
are equivalent. Thus a subenumeration is a listing (up to equivalence) of the types
of T, “hidden” among some other partial types. We say the types of a theory are
subenumerable if such a subenumeration exists.


Definition 9.10.13 (Hirschfeldt, Shore, and Slaman [153]). The atomic model 
theorem with subenumerable types (AST) is the following statement: every complete 
theory whose types are subenumerable has an atomic model. 


As stated, this is only interesting in the context of reverse mathematics, since clas- 
sically there is no content to a subenumeration existing or not. Alternatively, if we 
regard the associated problem form, then the subenumeration should be understood 
as being part of the instance. A computable instance of AST is therefore a complete 
atomic decidable theory, together with a computable subenumeration of the types 
of the theory. In particular, in such a theory every type is actually computable. By 
Theorem 9.10.5, this means that finding solutions to computable instances of AST is 
the same as finding listings of isolated paths of infinite computable trees U ⊆ 2^{<ω}
with no dead ends, isolated paths dense, and all paths computable. 

Goncharov and Nurtazin [127] showed that AST is not computably true. On the 
other hand, we have the following result. 


Theorem 9.10.14 (Hirschfeldt [146]). Let U ⊆ 2^{<ω} be an infinite computable tree
with no dead ends, isolated paths dense, and all paths computable. Then every 
noncomputable set computes a listing of the isolated paths of U. 


Proof. Fix any noncomputable set, X. Let σ_0, σ_1, ... be a computable enumeration
of the elements of U. For each n, we define a path f_n ∈ [U] as follows. Let τ_{n,0} = σ_n
and suppose we have defined τ_{n,s} ∈ U for some s ∈ ω. Since U has no dead ends,
either τ_{n,s}0 or τ_{n,s}1 must belong to U. If both do, call s a coding stage, let k be
the number of coding stages prior to s, and let τ_{n,s+1} = τ_{n,s}⌢X(k). If s is not a
coding stage, let τ_{n,s+1} be whichever of τ_{n,s}0 or τ_{n,s}1 belongs to U. Let f_n = ⋃_s τ_{n,s}.
Clearly, ⟨f_n : n ∈ ω⟩ is uniformly computable in X and each f_n is a path through
U. Moreover, every isolated path through U must be some f_n. Indeed, if f ∈ [U] is
isolated by σ_n, then f_n = f. We claim that, in fact, each f_n is isolated in U, whence
it follows that ⟨f_n : n ∈ ω⟩ is an X-computable listing of the isolated paths of U, as
desired. Suppose not, and fix n to the contrary. Since f_n ∈ [U], it is computable by
assumption on U. Hence, the sequence ⟨τ_{n,s} : s ∈ ω⟩ is computable, and so is the
set of coding stages in the construction of f_n. Now, f_n is not isolated, so the set of
coding stages is infinite. Enumerate these as s_0 < s_1 < ···. Then for all k ∈ ω we
have X(k) = f_n(|τ_{n,s_k}|), which means that X is computable, a contradiction. □
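
The construction of the paths f_n can be simulated on finite approximations. The following Python sketch is only an illustration under stated assumptions: the tree is given by a hypothetical membership predicate `in_tree`, the set X by a function `X` from indices to bits, and the toy tree `at_most_one_one` (all binary strings containing at most one 1) is our own example rather than one from the text. This toy tree has no dead ends, all of its paths are computable, and its isolated paths (those containing a 1) are dense.

```python
from typing import Callable, Tuple

Bits = Tuple[int, ...]


def coded_path(in_tree: Callable[[Bits], bool], start: Bits,
               X: Callable[[int], int], steps: int) -> Bits:
    """Run `steps` stages of the construction, starting from the node `start`.

    At a coding stage (both one-bit extensions lie in the tree) the next
    unused bit of X is appended; otherwise the unique extension is taken.
    """
    tau = start
    k = 0  # number of coding stages seen so far
    for _ in range(steps):
        left, right = tau + (0,), tau + (1,)
        has_left, has_right = in_tree(left), in_tree(right)
        if has_left and has_right:
            tau = tau + (X(k),)  # coding stage: follow X(k)
            k += 1
        elif has_left:
            tau = left
        elif has_right:
            tau = right
        else:
            raise ValueError("dead end: the tree assumption is violated")
    return tau


def at_most_one_one(sigma: Bits) -> bool:
    """Toy tree (an assumption for illustration): at most one 1 occurs."""
    return sum(sigma) <= 1


def X(k: int) -> int:
    """A stand-in for the set X: here just the set {2}."""
    return 1 if k == 2 else 0


print(coded_path(at_most_one_one, (), X, 5))
print(coded_path(at_most_one_one, (1,), X, 3))
```

Starting from the empty string, the first three stages are coding stages, so the prefix records X(0), X(1), X(2). Once a 1 has been written there are no further coding stages, which matches the fact that the resulting path is isolated.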


Relativizing this argument, and formalizing it in RCA₀, yields a computability theoretic
characterization of AST.


Theorem 9.10.15 (Hirschfeldt, Shore, and Slaman [153]). The following are
equivalent over RCA₀.


1. AST.
2. (∀X)(∃Y)[Y ≰_T X].


This shows that AST is essentially the weakest possible principle that is not com- 
putably true, i.e., which does not hold in REC. (The only conceivably weaker such 
principles would have to be ones that hold in topped models, which are arguably not 
natural.) 

Since no hyperimmune set can be computable, it is clear that AST ≤_ω OPT. This
can be formalized in RCA₀ to get an implication.


Proposition 9.10.16 (Hirschfeldt, Shore, and Slaman [153]). RCA₀ ⊢ OPT →
AST.


On the other hand, using Theorems 9.10.12 and 9.10.15 one can construct an ω-model
of AST in which OPT fails (Exercise 9.13.19).

There are other classes of models with interesting computability theoretic and 
reverse mathematics properties. Harris [138] undertook a similar study of saturated 
models, while Hirschfeldt, Lange, and Shore [150] studied homogeneous models. 


9.10.3 The finite intersection principle 


We now switch fields, and move from model theory to set theory. Our interest 
will be in a family of principles well-known from classical investigations of the 
axiom of choice. There has been extensive study, stretching back over a century, of 
various equivalents and consequences of the axiom of choice, relative to choice- 
free axiomatizations of set theory (e.g., ZF). The most famous of these, of course, 
are Zorn’s lemma and the well-ordering principle. But there are myriad others, 
from nearly every corner of mathematics, and more still if one looks at equivalents 
of various weaker forms of choice. A monumental catalog of such principles has 
been put together by Rubin and Rubin [261, 262], later expanded by Howard and 
Rubin [162]. 

The study of what happens to these equivalences in Z₂ was initiated by Dzhafarov
and Mummert [87, 88]. To be sure, to even express most of the principles studied
alongside choice in L₂ requires a significant miniaturization: principles concerning
arbitrary sets now concern only sets of numbers. But this makes the situation all the 
more interesting. Stripped of the power that comes from being able to speak about 
sets in general, regardless of type or complexity, we are left with just the underlying 
combinatorics. 

The principles studied in [88] concern various intersection principles, most fa- 
mous among them the finite intersection principle. This (classically) asserts that 
every family of nonempty sets has a ⊆-maximal subfamily with the so-called finite
intersection property, meaning that any finite subfamily has nonempty intersection. 
The finite intersection principle, and its variants for intersections of fixed size, are
each equivalent to the axiom of choice over ZF (see [262], Chapter 4). In [88], these 
principles were miniaturized to second order arithmetic as follows. 


Definition 9.10.17. Let 𝒳 = ⟨X_i : i ∈ ω⟩ and 𝒴 = ⟨Y_i : i ∈ ω⟩ be families of sets.


1. We write X ∈ 𝒳 if X = X_i for some i ∈ ω.

2. 𝒳 is nontrivial if X_i ≠ ∅ for some i.

3. 𝒴 = ⟨Y_i : i ∈ ω⟩ is a subfamily of 𝒳 if Y_i ∈ 𝒳 for every i.

4. For n > 1, 𝒴 satisfies the n intersection property, or n-i.p., if for every finite set
F ⊆ ω of size n, ⋂_{i∈F} Y_i ≠ ∅.

5. 𝒴 satisfies the finite intersection property, or f.i.p., if for every nonempty finite
set F ⊆ ω, ⋂_{i∈F} Y_i ≠ ∅.

6. If 𝒴 is a subfamily of 𝒳 satisfying n-i.p. or f.i.p., then 𝒴 is a maximal subfamily
with this property if for every other subfamily 𝒵 of 𝒳 with the same property, if
𝒴 is a subfamily of 𝒵 then 𝒵 is a subfamily of 𝒴.
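
For finite families, the intersection properties just defined can be checked by brute force. The Python sketch below is an illustration only: it treats a family as a finite list of finite sets (whereas the definition concerns infinite sequences of sets of numbers), and the function names `has_n_ip` and `has_fip` are our own.

```python
from itertools import combinations


def has_n_ip(family, n):
    """n-i.p.: any n distinct indices pick out sets with a common element."""
    for indices in combinations(range(len(family)), n):
        common = set.intersection(*(set(family[i]) for i in indices))
        if not common:
            return False
    return True


def has_fip(family):
    """f.i.p.: n-i.p. for every n from 1 up to the length of the family."""
    return all(has_n_ip(family, n) for n in range(1, len(family) + 1))


# Any two of these sets meet, but all three together do not.
triangle = [{1, 2}, {2, 3}, {1, 3}]
# Every set contains 0, so every finite subfamily has nonempty intersection.
star = [{0, 1}, {0, 2}, {0, 3}]

print(has_n_ip(triangle, 2), has_n_ip(triangle, 3), has_fip(star))
```

The `triangle` family shows why n-i.p. for a fixed n is genuinely weaker than f.i.p.: it satisfies 2-i.p. but fails 3-i.p.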


Definition 9.10.18 (Dzhafarov and Mummert [88]). 


1. For n > 1, the n intersection principle (nIP) is the following statement: every
nontrivial family of sets has a maximal subfamily satisfying n-i.p.

2. The finite intersection principle (FIP) is the following statement: every nontrivial
family of sets has a maximal subfamily satisfying f.i.p.


The basic relationship between FIP and nIP is the following.


Proposition 9.10.19 (Dzhafarov and Mummert [88]). For all m > n ≥ 2,
RCA₀ ⊢ FIP → mIP → nIP.


Because of the maximality condition, this does require a bit of argument. See Exer- 
cise 9.13.21. 

Most surprising about the strength of FIP is that it is closely related to the 
principles AMT and OPT of the previous section. This is in spite of the fact that the 
classical versions of the model theoretic principles are all provable in ZF, and are 
thus unrelated to the axiom of choice in any way. 

Dzhafarov and Mummert [88] showed that FIP follows from a principle called
Π^0_1G, introduced by Hirschfeldt, Shore, and Slaman [153]. This is a kind of strong
genericity principle, known by results of Conidis [51] to be ω-model equivalent to
AMT. It follows that over ω-models, FIP is implied by AMT. Now, Hirschfeldt, Shore,
and Slaman [153] showed that to formalize the equivalence between Π^0_1G and AMT
requires IΣ^0_2. But it turns out that this is unnecessary for the implication to FIP.


Theorem 9.10.20 (Day, Dzhafarov, and Miller, unpublished; Hirschfeldt and
Greenberg, unpublished). RCA₀ ⊢ AMT → FIP.


Dzhafarov and Mummert [88] showed AMT ≰_ω FIP, so also RCA₀ ⊬ FIP → AMT.
But by contrast they also showed the following. 


Theorem 9.10.21 (Dzhafarov and Mummert [88]). RCA₀ ⊢ 2IP → OPT.


Proof (Downey, Diamondstone, Greenberg and Turetsky [65]). We exhibit a computable
instance 𝒳 = ⟨X_i : i ∈ ω⟩ of 2IP, every solution to which computes a
hyperimmune function. Formalizing this in RCA₀ and applying Theorem 9.10.12
yields the result.

We build 𝒳 by stages. When we say that we “make” two sets X_i and X_j intersect
at some stage s, we mean the following.


• Add 2⟨i, s⟩ to X_i and 2⟨j, s⟩ to X_j.
• Add 2⟨i, j, s⟩ + 1 to both X_i and X_j.


It follows that if i ≠ j then X_i ∩ X_j will always be contained in the set of odd
numbers. At stage 0, we make each of the following pairs of sets intersect.


• X_{q(e,x)} and X_{q(e,x*)}, for all e and all x ≠ x*.
• X_{q(e,x)} and X_{q(e*,x*)}, for all e ≠ e* and all x and x*.


At stage s, consider all e, x < s. If Φ_e(e)[s]↓, make X_{p(e)} intersect X_{q(e*,x)}, for all
e* ≠ e and all x. If also Φ_e(x)[s]↓, make X_{p(e)} intersect X_{q(e,x)}. This completes
the construction. Clearly, 𝒳 is computable.

Let 𝒴 = ⟨Y_i : i ∈ ω⟩ be any maximal subfamily of 𝒳 satisfying 2-i.p. Each set in
𝒴 is nonempty, so given i we can 𝒴-computably find the least s such that 2⟨j, s⟩ ∈ Y_i
for some j. That is, we can 𝒴-computably find the unique j such that Y_i = X_j. Now,
by construction and maximality, if X_{p(e)} ∈ 𝒴 then X_{q(e,x)} ∈ 𝒴 if and only if Φ_e(x)↓,
whereas if X_{p(e)} ∉ 𝒴 then X_{q(e,x)} ∈ 𝒴 for all x.

We use this fact to define a 𝒴-computable function f, as follows. Fix x. Let i(x)
be the least j ∈ ω such that for each e ≤ x, at least one of the following holds.


1. There is an i < j with Y_i = X_{p(e)}.
2. There is an i < j with Y_i = X_{q(e,x+1)}.


So if (1) holds then Φ_e(e)↓, and if (1) and (2) both hold then Φ_e(x + 1)↓ and so,
by our use conventions, also Φ_e(x)↓. Let f(x) be the least number larger than Φ_e(e) if
(1) holds for e = x, and larger than Φ_e(x) for all e ≤ x for which both (1) and (2)
hold. Clearly, the map x ↦ i(x) is 𝒴-computable, as is determining for each e ≤ x
which of (1) or (2) hold, so f is 𝒴-computable.

We claim that f is not dominated by any computable function. To see this, suppose
Φ_e is total. Then X_{p(e)} belongs to 𝒴, as does X_{q(e,x)} for all x. Fix the least j such
that Y_j = X_{p(e)}. If there is no i < j such that Y_i = X_{q(e,x)} for some x > e then in the
definition of f we must have i(e) > j, and therefore f(e) > Φ_e(e). Otherwise, fix
the largest x > e such that Y_i = X_{q(e,x)} for some i < j. In particular, there is no i < j
with Y_i = X_{q(e,x+1)}. Hence, we must have i(x) > j, and therefore f(x) > Φ_e(x). □


Thus, the set theoretic principles FIP and 2IP are sandwiched between the model 
theoretic principles AMT and OPT. Dzhafarov and Mummert [88] left open the 
question of whether OPT is actually equivalent to FIP (or 2IP). A number of degree 
theoretic upper bounds on the complexity of FIP were established in [88]. Downey, 
Diamondstone, Greenberg, and Turetsky [65] observed that all of them are enjoyed
by Cohen 1-generics, which led them to the following: if X ≤_T ∅′, then X computes
a solution to every computable instance of FIP if and only if it computes a 1-generic. (As it
turns out, FIP admits a universal instance, so this is the same as asking about a single
computable instance of FIP. See Exercise 9.13.20.) Remarkably, Cholak, Downey,
and Igusa [27] showed that 1-genericity also characterizes the strength of FIP outside
the ∅′-computable degrees.


Definition 9.10.22 (Existence of 1-generics). 1-GEN is the following statement: 
(∀X)(∃Y)[Y is Cohen 1-generic relative to X].


Theorem 9.10.23 (Cholak, Downey, and Igusa [27]). 


1. For all n ≥ 2, FIP ≡_c nIP ≡_c 1-GEN.
2. RCA₀ ⊢ FIP ↔ 1-GEN.
3. For n ≥ 2, RCA₀ + IΣ^0_2 ⊢ nIP ↔ 1-GEN.


Thus, yet again we see a computability theoretic characterization of a noncom- 
putability theoretic principle. Using the characterization, it is possible to answer the 
question from [88] and separate FIP from OPT. The proof uses a surprisingly strong 
degree theoretic fact. 


Corollary 9.10.24. OPT <_ω FIP.


Proof (Hirschfeldt, personal communication). Epstein [94] (see also [1], p. 27) constructed
an initial segment of the Δ^0_2 Turing degrees of order type ω. That is,
there are sets ∅ = A_0 <_T A_1 <_T ··· <_T ∅′ such that if Y ≤_T A_i for some i
then Y ≡_T A_j for some j (necessarily j ≤ i). Thus, if we let S be the ω-model
{Y ∈ 2^ω : (∃i)[Y ≤_T A_i]} then, up to Turing equivalence, S = {A_i : i ∈ ω}. Now, for each i we have
A_i <_T A_{i+1} <_T ∅′ ≤_T A_i′. Relativizing Theorem 2.8.20 to A_i, we see that A_{i+1}
is hyperimmune relative to A_i. By Theorem 9.10.12 this implies that S ⊨ OPT.
However, no A_i can be 1-generic. Indeed, if we write A_i as B_0 ⊕ B_1 then B_0 and B_1
are both computable from A_i, hence each is Turing equivalent to A_j for some j. It follows that
either B_0 ≤_T B_1 or B_1 ≤_T B_0. But by Exercise 7.8.1, the odd and even halves of a
1-generic are Turing incomparable. By Theorem 9.10.23, S ⊭ FIP. □


The proof of Theorem 9.10.23, while self-contained in terms of methods, is fairly 
involved and would take us a bit too far afield. One interesting observation made 
in [27] is that the proof seems to have an essential nonuniformity. This raises the 
following question. 


Question 9.10.25. Is it the case that 1-GEN ≤_W FIP?


Another question about the strength of intersection principles, left over from the 
initial investigation in [88], is whether FIP and 2IP are equivalent over RCA₀. Theorem 9.10.23
allows us to recast this in terms of genericity.


Question 9.10.26. Is it the case that RCA₀ ⊢ 2IP → 1-GEN?


9.11 Weak weak König’s lemma


The final combinatorial principle we consider in this chapter is actually a bit of an 
outlier among those we have been looking at so far. It is not related in any way to 
RT, but rather to WKL. This is the so-called weak weak König’s lemma.


Definition 9.11.1 (Weak weak König’s lemma).
1. WWKL is the following statement: for every T ⊆ 2^{<ω} such that

$$\lim_{n} \frac{|\{\sigma \in 2^n : \sigma \in T\}|}{2^n} > 0, \tag{9.7}$$

there exists an f ∈ [T].
2. WWKL₀ is the L₂ theory consisting of RCA₀ plus WWKL.


As a subsystem, WWKL₀ was first introduced by Yu and Simpson [330]. It was further
developed by Brown, Giusto, and Simpson [23], and by Giusto and Simpson [124].
Note that the ratios |{σ ∈ 2^n : σ ∈ T}|/2^n are nonincreasing in n and bounded
below by 0, so (classically) the limit above always exists. The starting observation
here is the following. Let μ be the “fair coin” measure on Cantor space defined in
Section 9.4. Because

$$[T] = \bigcap_{n} [[\{\sigma \in 2^n : \sigma \in T\}]]$$

and because [[{σ ∈ 2^n : σ ∈ T}]] ⊇ [[{σ ∈ 2^{n+1} : σ ∈ T}]] for all n, we have

$$\mu([T]) = \lim_{n} \mu([[\{\sigma \in 2^n : \sigma \in T\}]]) = \lim_{n} 2^{-n} \cdot |\{\sigma \in 2^n : \sigma \in T\}|.$$

Hence T satisfies (9.7) if and only if μ([T]) > 0. For this reason, at least when
working over ω-models, we can think of the instances of WWKL as interchangeable
with Π^0_1 classes of positive measure. (For this reason, some authors also refer to trees
satisfying (9.7) as “trees of positive measure”.)
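
The ratios appearing in (9.7) are easy to compute for small n. The following Python sketch is a brute-force illustration under our own assumptions: a tree is represented by a hypothetical membership predicate on tuples of bits, and the two example trees are chosen by us, not taken from the text.

```python
from fractions import Fraction
from itertools import product


def level_density(in_tree, n):
    """Return |{sigma in 2^n : sigma in T}| / 2^n as an exact fraction."""
    count = sum(1 for sigma in product((0, 1), repeat=n) if in_tree(sigma))
    return Fraction(count, 2 ** n)


def first_bit_zero(sigma):
    """Strings that are empty or begin with 0: the paths have measure 1/2."""
    return len(sigma) == 0 or sigma[0] == 0


def no_two_ones_in_a_row(sigma):
    """Strings with no two consecutive 1s: infinite tree, densities tend to 0."""
    return all(not (a == 1 and b == 1) for a, b in zip(sigma, sigma[1:]))


for n in (1, 4, 10):
    print(n, level_density(first_bit_zero, n),
          level_density(no_two_ones_in_a_row, n))
```

The first tree is an instance of WWKL, since its ratios are constantly 1/2 from level 1 on. The second tree is infinite, so it is an instance of WKL, but its level counts are Fibonacci numbers and the ratios tend to 0, so it does not satisfy (9.7).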

It turns out that Π^0_1 classes of positive measure have interesting connections to algorithmic
randomness. Recall the definition of a 1-random set from Definition 9.4.5.


Theorem 9.11.2 (Kučera [192]). Let C be a Π^0_1 class and X a 1-random. If
μ(C) > 0 then C has an X-computable member. In fact, X = τf for some τ ∈ 2^{<ω} and some f ∈ C.


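Proof. Fix a computable tree T with [T] = C. We may assume T ≠ 2^{<ω}, as otherwise
the result is obvious. Set

$$U_0 = \{\sigma \in 2^{<\omega} : \sigma \notin T \wedge (\forall \tau \prec \sigma)[\tau \in T]\},$$

and for each n > 0, let

$$U_n = \{\sigma \in 2^{<\omega} : (\exists \tau \in U_{n-1})(\exists \rho \in U_0)[\sigma = \tau\rho]\}.$$
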
Then the sequence ⟨U_n : n ∈ ω⟩ is uniformly c.e. Moreover, it follows by induction
that each U_n is a prefix free set of strings. So, for each n > 0,

$$\mu([[U_n]]) = \sum_{\tau \in U_{n-1}} \sum_{\rho \in U_0} 2^{-|\tau\rho|} = \mu([[U_{n-1}]]) \cdot \mu([[U_0]]).$$

Inductively, μ([[U_n]]) = μ([[U_0]])^{n+1} for all n. Since 0 < μ([[U_0]]) < 1 by assumption,
we can uniformly computably extract a sequence n_0 < n_1 < ··· such that
μ([[U_{n_k}]]) < 2^{-k} for all k. Now if X is 1-random, there must consequently exist a
k such that X ∉ [[U_k]]. Pick the least such k. If k = 0, then X ∈ [T] by definition.
Suppose k > 0. Then X ∈ [[U_{k−1}]], so necessarily τ ≺ X for some τ ∈ U_{k−1}. Define
f ∈ 2^ω by f(x) = X(x + |τ|). Clearly, f ≤_T X, and we claim that f ∈ [T]. If not,
then some ρ ≺ f belongs to U_0. But then τρ ≺ X, witnessing that X ∈ [[U_k]], a
contradiction. □
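
The measure computation for the sets U_n can be checked on a small example. In the Python sketch below, which is an illustration under our own assumptions, U_0 is taken to be the finite prefix free set {00, 1} (so the corresponding tree has [T] = [[01]], of measure 1/4), strings are tuples of bits, and the helper names are hypothetical.

```python
from fractions import Fraction


def measure(strings):
    """Measure of [[U]] for a prefix free set U: the sum of 2^(-|sigma|)."""
    return sum(Fraction(1, 2 ** len(s)) for s in strings)


def is_prefix_free(strings):
    """True if no string in the set is a proper prefix of another."""
    return not any(a != b and b[:len(a)] == a
                   for a in strings for b in strings)


def next_level(U_prev, U0):
    """Form all concatenations tau + rho with tau in U_prev and rho in U0."""
    return {tau + rho for tau in U_prev for rho in U0}


U0 = {(0, 0), (1,)}
U1 = next_level(U0, U0)
U2 = next_level(U1, U0)

print(measure(U0), measure(U1), measure(U2))
```

Here the measures are 3/4, (3/4)^2 and (3/4)^3, in line with the inductive identity used in the proof, and they shrink geometrically as n grows.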


Corollary 9.11.3. Let S be an ω-model. Then S ⊨ WWKL if and only if for every
A ∈ S there exists an X ∈ S such that X is 1-random relative to A. In particular,
there is an ω-model of RCA₀ + ¬WWKL₀.


Proof. First, suppose S ⊨ WWKL. Fix any A ∈ S. Relativizing Theorem 9.4.6, let
⟨𝒰_n : n ∈ ω⟩ be a universal Martin-Löf test relative to A. Let C_1 = 2^ω ∖ 𝒰_1. Then
C_1 is a Π^{0,A}_1 class since f ∈ C_1 if and only if (∀k)[f↾k ∉ U_1] for the A-c.e. set
U_1 ⊆ 2^{<ω} for which 𝒰_1 = [[U_1]]. So there is some A-computable tree T ⊆ 2^{<ω} with
C_1 = [T]. In particular, T ∈ S. Furthermore, μ(C_1) = 1 − μ(𝒰_1) ≥ 1 − 2^{-1} = 1/2, so
T is an instance of WWKL. Therefore, there exists X ∈ S ∩ [T]. Since X ∉ ⋂_n 𝒰_n,
X is 1-random relative to A.

Now suppose that for every A ∈ S there exists an X ∈ S such that X is 1-random
relative to A. Let T ∈ S be any instance of WWKL. Then C = [T] is a Π^{0,T}_1 class
of positive measure. Let X be any set 1-random relative to T such that X ∈ S. By
Theorem 9.11.2, relativized to T, there is an X-computable f ∈ C. As S is closed
under ≤_T, we have f ∈ S. Since T was arbitrary, we conclude that S ⊨ WWKL, as
desired.

For the second part, since no 1-random can be computable, REC ⊭ WWKL₀. □


Corollary 9.11.4. If X is 1-random, then there is an ω-model of WWKL consisting
entirely of X-computable sets.


Proof. By Theorem 9.4.18, let S be an ω-model consisting entirely of X-computable
sets, and such that for every Y ∈ S there is a Y* ∈ S which is 1-random relative to
Y. By Corollary 9.11.3, S ⊨ WWKL. □


How does WWKL compare to WKL (or WWKL₀ to WKL₀)? Obviously, every Π^0_1
class of positive measure is, a fortiori, nonempty. Hence, WWKL is a subproblem of
WKL. But in fact, the extra “W” is deserved, and WWKL is indeed a weaker principle.
To see this, we will use the following result of Stephan. We omit the proof, which can 
be readily found in other texts (e.g., Downey and Hirschfeldt [83, Theorem 8.8.4]). 


Theorem 9.11.5 (Stephan [303]). If X is 1-random and of PA degree then X ≥_T ∅′.


Corollary 9.11.6 (Simpson and Yu [330]). WKL ≰_ω WWKL. Hence, WWKL₀ ⊬
WKL₀.


Proof. By Corollary 9.4.7, there is a nonempty Π^0_1 class consisting entirely of 1-random
sets. Using the cone avoidance basis theorem (Theorem 2.8.23), we can
therefore find a 1-random set X ≱_T ∅′. Apply Corollary 9.11.4 to find an ω-model
S of WWKL consisting entirely of X-computable sets. Now, no Y ∈ S can have
PA degree. Otherwise, any Y* ∈ S which is 1-random relative to Y would be both
1-random and of PA degree, and so would compute ∅′ by Theorem 9.11.5. But ∅′ is
not computable from X, and therefore not in S. By Proposition 4.6.3, we conclude
that S ⊭ WKL. □


Another striking contrast between WKL and WWKL is demonstrated with strong 
omniscient computable reducibility, ≤_soc, which was introduced in Definition 8.8.9.


Theorem 9.11.7 (Monin and Patey [217]). 


1. KL ≤_soc WKL.
2. KL ≰_soc WWKL.


Proof. For part (1), suppose we are given an instance T ⊆ ω^{<ω} of KL. Thus T is an
infinite, finitely-branching tree, so we may pick some f ∈ [T]. Let S = {χ_f↾k :
k ∈ ω}, where χ_f is the characteristic function of f. Then S is an instance of WKL
whose only solution is χ_f, which computes f.

For part (2), let T ⊆ ω^{<ω} be any instance of KL with a single noncomputable
path, f (e.g., as in the proof of Proposition 3.6.10). By a famous result of Sacks
(see [74, Corollary 8.12.2]), the collection of sets that compute f has measure 0.
Hence, every instance of WWKL must have a solution that does not compute f. So,
T witnesses that KL ≰_soc WWKL. □


A similar argument to part (2) above yields a perhaps even more surprising result. 


Theorem 9.11.8 (Hirschfeldt and Jockusch [148]). RT^1_2 ≰_soc WWKL.


Proof. Let S be a bi-hyperimmune set. (That is, S and its complement S̄ are both hyperimmune. The
construction of such a set is a standard exercise.) Let c: ω → 2 be the characteristic
function of S, regarded as an instance of RT^1_2. Every RT^1_2-solution to c is therefore
an infinite subset of either S or S̄. By Theorem 9.4.21, the collection of all sets that
can compute an infinite subset of a given hyperimmune set has measure 0. Hence,
the collection of sets that can compute a solution to c has measure 0, too. It follows
that every instance of WWKL has a solution that does not compute a solution to c. □


WWKL is not entirely without strength, however. The following result is an immediate
consequence of Kučera’s theorem (Theorem 9.4.14). We include the proof
in order to show how it is formalized.


Corollary 9.11.9 (Giusto and Simpson [124]). WWKL₀ ⊢ DNR.

Proof. We argue in RCA₀. Let A be any set. Define the following tree:

$$T = \{\sigma \in 2^{<\mathbb{N}} : (\forall e \leq |\sigma|)[e > 1 \wedge \Phi^A_e(e){\downarrow} \in 2^e \to \Phi^A_e(e) \neq \sigma \upharpoonright e]\}.$$

Note that for all n > 1, if σ has length n and σ ∉ T but σ↾(n−1) ∈ T it must be that σ = Φ^A_n(n). In
particular, σ is unique with this property. Since T contains both strings of length 1,
it thus follows by induction that for all n > 1,

$$\frac{|\{\sigma \in 2^n : \sigma \in T\}|}{2^n} \geq \frac{2 \cdot |\{\sigma \in 2^{n-1} : \sigma \in T\}| - 1}{2^n}.$$

It follows that T is an instance of WWKL. Apply WWKL to obtain some X ∈ [T]. Then
by definition, X↾e ≠ Φ^A_e(e) for all e > 1. Define a function f so that f(0) ≠ Φ^A_0(0),
f(1) ≠ Φ^A_1(1), and f(e) = X↾e for all e > 1. Then f is DNC relative to A. Since
A was arbitrary, the proof is complete. □
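
To see why a tree that loses at most one string per level still has positive measure, one can iterate the worst case numerically. The Python sketch below is an illustration under our own assumptions: it supposes that both strings of length 1 lie in the tree and that every level n > 1 loses exactly one string, so that the level counts satisfy c_n = 2c_{n−1} − 1. The function name is hypothetical.

```python
from fractions import Fraction


def worst_case_densities(levels):
    """Densities c_n / 2^n when each level n > 1 removes exactly one string."""
    count = 2  # both strings of length 1 belong to the tree
    densities = {1: Fraction(count, 2)}
    for n in range(2, levels + 1):
        count = 2 * count - 1  # double the survivors, then remove one string
        densities[n] = Fraction(count, 2 ** n)
    return densities


for n, d in worst_case_densities(6).items():
    print(n, d)
```

In this worst case c_n = 2^{n−1} + 1, so the density at level n is 1/2 + 2^{−n}, which stays above 1/2 for every n.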


Using a fairly sophisticated combinatorial argument, Ambos-Spies, Kjos-Hanssen, 
Lempp, and Slaman [2] showed that the above is actually a strict implication. We 
omit the proof. Thus, WWKL₀ is intermediate between WKL₀ and DNR.


Theorem 9.11.10 (Ambos-Spies, Kjos-Hanssen, Lempp, and Slaman [2]). There
is an ω-model of DNR + ¬WWKL₀. In particular, DNR does not imply WWKL over RCA₀.


An interesting question is whether the strength of WWKL is in any way changed
if we look not just at Π^0_1 classes of (arbitrary) positive measure, but some prescribed
amount of measure instead. To this end, let us define the following class of restrictions
of WWKL.


Definition 9.11.11 (Dorais, Dzhafarov, Hirst, Mileti, and Shafer [72]). Fix q ∈ ℚ
with 0 < q < 1. q-WWKL is the following statement: for every T ⊆ 2^{<ω} such that

$$\lim_{n} \frac{|\{\sigma \in 2^n : \sigma \in T\}|}{2^n} \geq q,$$

there exists an f ∈ [T].


So if 0 < p < q < 1 then q-WWKL is a subproblem of p-WWKL, which in turn
is a subproblem of WWKL. It turns out that over RCA₀, all of these principles are
equivalent.


Theorem 9.11.12 (Dorais, Dzhafarov, Hirst, Mileti, and Shafer [72]). Fix p, q ∈
ℚ with 0 < p < q < 1.


1. p-WWKL ≡_sc q-WWKL.
2. RCA₀ ⊢ q-WWKL ↔ WWKL.


Proof. For part (1), we need only show that p-WWKL ≤_sc q-WWKL. Fix any tree
T ⊆ 2^{<ω} with μ([T]) ≥ p. Let ⟨𝒰_n : n ∈ ω⟩ be a universal Martin-Löf test relative
to T, and let i be least so that 2^{-i} < 1 − q. Then 2^ω ∖ 𝒰_i is a Π^{0,T}_1 class, hence there
is a T-computable tree S ⊆ 2^{<ω} with [S] = 2^ω ∖ 𝒰_i. By choice of i, μ([S]) > q.
Since every X ∈ [S] is 1-random relative to T, it follows by Theorem 9.11.2 that
X = τf for some τ ∈ 2^{<ω} and some f ∈ [T]. In particular, X computes an element of [T].

For part (2), we note that the above argument can be readily formalized in RCA₀.
We then argue as follows. Given an instance T of WWKL, fix a positive p ∈ ℚ so that

$$\lim_{n} \frac{|\{\sigma \in 2^n : \sigma \in T\}|}{2^n} \geq p.$$

Then, apply the above argument. This shows that q-WWKL implies WWKL, hence
q-WWKL ↔ WWKL. □


But there is a detectable difference under finer reducibilities. 


Proposition 9.11.13 (Dorais, Dzhafarov, Hirst, Mileti, and Shafer [72]). Fix
p, q ∈ ℚ with 0 < p < q < 1. Then p-WWKL ≰_W q-WWKL.


Proof. Fix Turing functionals Φ and Ψ. Uniformly computably in indices for Φ
and Ψ we build a computable tree T ⊆ 2^{<ω} witnessing that it is not the case that
p-WWKL ≤_sW q-WWKL via Φ and Ψ. From here, it follows by Proposition 4.4.5 that
p-WWKL ≰_W q-WWKL.

The construction of T is in stages. At stage s, we define a finite tree T_s ⊆ 2^{≤s}
containing at least 2^s p many strings of length s. We assume without loss of generality
that Φ(T_s) is always a finite subtree of 2^{<ω}, and we let n_s be least such that
Φ(T_s) ⊆ 2^{≤n_s}. By our usual use conventions, we may also assume n_s ≤ s for
convenience.

Let k ∈ ω be least so that 2^{-k} < q, and let m ∈ ω be large enough so that
1 − 2^{-m}k ≥ p.


Construction. At stage 0, let T_0 = {⟨⟩}. Now suppose we are at stage s + 1, and that
T_s has already been defined. Let x_0 < ··· < x_{m−1} be the least numbers that we have
not yet acted for in the construction, as defined below. We consider two cases.


Case 1: The following hold.


1. m ≤ s.
2. x_{m−1} < s.
3. n_s > 0.
4. Ψ^σ(x_i)↓ < 2 for all σ ∈ Φ(T_s) of length n_s and all i < m.
5. |{σ ∈ 2^{n_s} : σ ∈ Φ(T_s)}| ≥ 2^{n_s} q.


Define ρ ∈ 2^m as follows. For each i < m, if at least half of all σ ∈ Φ(T_s) of
length n_s satisfy Ψ^σ(x_i) = 0, let ρ(i) = 0. Otherwise, let ρ(i) = 1. (In this case, of
course, at least half of all σ ∈ Φ(T_s) of length n_s satisfy Ψ^σ(x_i) = 1.) We now let
T_{s+1} be T_s together with σ0 and σ1 for every σ ∈ T_s of length s for which
(∃i < m)[σ(x_i) ≠ ρ(i)]. In this case, say we have acted for x_0, ..., x_{m−1}.


Case 2: Otherwise. In this case, we let T_{s+1} be T_s together with σ0 and σ1 for every
σ ∈ T_s ∩ 2^s.


Verification. Let T = ⋃_s T_s. Clearly, T is a tree, uniformly computable in Φ and Ψ.
We claim that μ([T]) ≥ p, and that if Φ(T) is an infinite tree, and if Ψ(f) ∈ [T] for
every f ∈ [Φ(T)], then μ([Φ(T)]) < q.

In the construction, the measure of [T] is only ever reduced if we enter Case 1. At
each such stage, at most 2^{-m} |{σ ∈ T_s : |σ| = s}| many strings are not extended into
T_{s+1}, so the eventual measure of [T] decreases by at most 2^{-m}. At the same time, the
eventual measure of [Φ(T)] is halved.

Suppose μ([T]) < p. Fix the least s such that T_{s+1} contains fewer than 2^{s+1} p
many strings of length s + 1. In particular, the construction must have entered Case 1
at stage s + 1. Say there are a total of j ≥ 1 many stages, up to and including s + 1, at
which the construction enters Case 1. Then we have


p > |{σ ∈ 2^{s+1} : σ ∈ T_{s+1}}| / 2^{s+1} ≥ 1 − 2^{-m} j.


By choice of m, we must have j > k. But then by choice of k, we have


|{σ ∈ 2^{n_s} : σ ∈ Φ(T_s)}| / 2^{n_s} ≤ 2^{-(j−1)} ≤ 2^{-k} < q.


Thus, property (5) fails at stage s + 1, so the construction cannot enter Case 1 at this
stage after all. This is a contradiction. We conclude that μ([T]) ≥ p.

Now suppose Φ(T) is infinite and that Ψ(f) ∈ [T] for all f ∈ [Φ(T)]. For fixed
choice of x_0 < ··· < x_{m−1}, it follows by assumption on Φ and Ψ that properties
(1)-(4) of Case 1 hold for all sufficiently large s. Thus, the construction must enter
Case 1 at some stage: if it did not then property (5) would always hold, so Case 1
would apply at all sufficiently large stages. At the same time, since μ([T]) ≥ p, the
construction can enter Case 1 at only finitely many stages s. Therefore, we can let
s_0 be the largest stage at which the construction enters Case 1. Now fix the least
x_0 < ··· < x_{m−1} not yet acted for at, or prior to, stage s_0. Let s ≥ s_0 be large enough
so that properties (1)-(4) of Case 1 hold. Then since the construction does not enter
Case 1 at stage s + 1, it must be that property (5) fails. This forces μ([Φ(T)]) < q,
as claimed. □


9.12 The reverse mathematics zoo 


The collection of inequivalent principles below (or around) ACA₀ has come to be
called the reverse mathematics zoo. The proliferation of these principles has made 
understanding the increasingly complex web of relationships between them much 
more difficult. Diagrams help. We have already seen several, in Figures 8.2, 9.1, 
9.2 and 9.4. This style of diagram, with arrows indicating implications/reductions, 
and double arrows indicating those that are strict, was introduced by Hirschfeldt and
Shore [152], and is now used widely in the reverse mathematics literature. 

Typically, such diagrams are included in a paper only to showcase the principles 
of interest there. Figure 9.5 displays a much larger selection of principles, assembled 
from across the reverse mathematics literature. One gets a sense from this of just how 
massive the reverse mathematics zoo has gotten over the years—the figure includes 
204 principles and combinations of principles. In addition, it becomes clear that 
diagrams of this sort have limited utility: when too many principles are included, 
they are really quite useless. More generally, keeping track of what is known and 
what is not in the subject is quite unwieldy with such a large body of work. 

Of course, this is not a problem unique to reverse mathematics. “Zoos” exist in 
many areas, from complexity theory to algorithmic randomness. Sometime prior 
to 2010, Kjos-Hanssen assembled a large diagram of downward-closed classes of 
Turing degrees, ordered by inclusion. Subsequently, Miller wrote an interactive 
command-line tool to make this information easier to work with. Starting with a 
database of facts, the program was able to derive new facts from old by transitive 
closure and display open questions. This was later adapted for reverse mathematics, 
and more specifically for weak combinatorial principles, by Dzhafarov. Here, one 
complication is that in addition to implications one has to deal with joins of principles, 
which makes closing off under what is known slower. The current version of this 
tool has been rewritten and optimized by Astor, and now also supports a variety of 
finer reducibilities, including ≤_c, ≤_W, and ≤_ω. (It was used to generate Figure 9.5.)
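
To illustrate the kind of bookkeeping such a tool performs, the following is a
minimal Python sketch. It is not the code of the actual tool: the function names and
the toy database are assumptions made for this illustration, the database is
deliberately incomplete, and joins of principles and finer reducibilities are not
handled.

```python
from itertools import product


def close_implications(principles, known):
    """Close the known implications (a, b), read "a implies b", under
    reflexivity and transitivity (a Floyd-Warshall style pass)."""
    implies = {(a, a) for a in principles} | set(known)
    # `mid` is the outermost loop variable, as Floyd-Warshall requires.
    for mid, a, b in product(principles, repeat=3):
        if (a, mid) in implies and (mid, b) in implies:
            implies.add((a, b))
    return implies


def close_non_implications(implies, known_non):
    """If a does not imply b, a implies c, and d implies b, then c does not
    imply d (otherwise a -> c -> d -> b)."""
    derived = set()
    for a, b in known_non:
        weaker_than_a = {y for x, y in implies if x == a}
        stronger_than_b = {x for x, y in implies if y == b}
        derived |= set(product(weaker_than_a, stronger_than_b))
    return derived


def open_questions(principles, implies, non_implies):
    """Pairs (a, b) for which the database decides neither direction."""
    return sorted(
        (a, b)
        for a, b in product(principles, repeat=2)
        if (a, b) not in implies and (a, b) not in non_implies
    )


# Toy database (illustrative only, and deliberately missing one fact).
principles = ["ACA0", "WKL0", "WWKL0", "RCA0"]
known = [("ACA0", "WKL0"), ("WKL0", "WWKL0"), ("WWKL0", "RCA0")]
known_non = [("WKL0", "ACA0"), ("RCA0", "WWKL0")]

implies = close_implications(principles, known)
non_implies = close_non_implications(implies, known_non)
open_pairs = open_questions(principles, implies, non_implies)
print(open_pairs)
```

In this toy database the only undecided pair is whether WWKL0 implies WKL0. That
question is settled in the literature, but the sketch reports it as open because the
corresponding fact was left out of the database.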

The reverse mathematics zoo tool is a simple Python script and can be down- 
loaded at rmzoo.uconn.edu. More recently, Khan, Miller, and Patey have written 
a visualizer for the tool, which also supports databases from areas other than re- 
verse mathematics. The visualizer runs in a web browser, and can be accessed at 
computability.org/zoo-viewer. 


9.13 Exercises 


Exercise 9.13.1. Show that RCA₀ proves the following: if (P, ≤_P) is a partial order,
there exists a linear ordering ≤_L of P that extends ≤_P. That is, x ≤_P y → x ≤_L y,
for all x, y ∈ P.


Exercise 9.13.2. 


1. Let (P, ≤_P) be a computable stable partial order, say with every element either
small or isolated. Show that if S ⊆ P is infinite and every element in it is small
(respectively, isolated), then there is an S-computable infinite set S* ⊆ S which
is a chain (respectively, antichain) for ≤_P.

2. Let (L, ≤_L) be a computable stable linear order. Show that if S ⊆ L is infinite and
every element in it is small (respectively, large), then there is an S-computable
infinite set S* ⊆ S which is an ascending sequence (respectively, descending
sequence) for ≤_L.


Figure 9.5. A snapshot of the reverse mathematics zoo, indicating the myriad principles below
ACA₀. Arrows indicate implications over RCA₀; double arrows are implications that cannot be
reversed.


Exercise 9.13.3. Show that WSCAC ≤_W SRT²₂.


Exercise 9.13.4. For each of the properties below, construct a Martin-Löf test ⟨U_n :
n ∈ ω⟩ such that a set X ∈ 2^ω satisfies the property if and only if X does not pass
the test.


1. X(n) = 0 for all even n.
2. The elements of X form an arithmetical progression.
3. If X = X_0 ⊕ X_1 then X_0 ≤_T X_1.


Exercise 9.13.5. 


1. Show that if 𝒰 ⊆ 2^ω is open then there exists a prefix free set U ⊆ 2^{<ω} such
that 𝒰 = [[U]] = ⋃_{σ∈U} [[σ]].

2. Show that if U, V ⊆ 2^{<ω} are both prefix free and [[U]] = [[V]] then
Σ_{σ∈U} 2^{-|σ|} = Σ_{σ∈V} 2^{-|σ|}.


Exercise 9.13.6. Prove Theorem 9.4.6 and Corollary 9.4.7. (Hint: For Corollary 9.4.7,
take a universal test ⟨U_n : n ∈ ω⟩ and consider the class 2^ω \ U_1.)


Exercise 9.13.7. Prove Proposition 9.4.10. 


Exercise 9.13.8 (Dorais, Dzhafarov, Hirst, Mileti, and Shafer [72]). Show that 
RT} <w RATS. 


Exercise 9.13.9. Prove Theorem 9.5.7. 
Exercise 9.13.10. Prove Lemma 9.9.4. 


Exercise 9.13.11. For k ≥ 1, let HT_k be the restriction of HT to k-colorings.
Prove that for all k ≥ 2, RCA₀ ⊢ HT_2 ↔ HT_k.


Exercise 9.13.12. Prove Theorem 9.9.33. 


Exercise 9.13.13. Prove the following in RCA₀. If T ⊆ 2^{<N} and T ≅ 2^{<N}, then there
exists an infinite sequence ⟨σ_i : i ∈ N⟩ with σ_i ≺ σ_{i+1} and σ_i ∈ T for all i. (We
might call such a sequence a “path” through T, even though T is not formally a tree.)


Exercise 9.13.14 (Dzhafarov, Hirst, and Lakins [84]). Fix k ≥ 1 and c : [2^{<ω}]² → k.
Say that c is


• 1-stable if for every σ ∈ 2^{<ω} there exist i < k and n > |σ| such that c(σ, τ) = i
for all τ ≻ σ with |τ| > n.

• 2-stable if for every σ ∈ 2^{<ω} there is an n > |σ| such that for every extension
τ ≻ σ of length n, c(σ, ρ) = c(σ, τ) for every ρ ≻ τ.

• 3-stable if for each σ ∈ 2^{<ω} there exists i < k such that for every σ* ≻ σ there
exists τ ≻ σ* with c(σ, ρ) = i for all ρ ≻ τ.

• 4-stable if for each σ ∈ 2^{<ω} and each σ* ≻ σ, there exists τ ≻ σ* such that
c(σ, ρ) = c(σ, τ) for all ρ ≻ τ.

• 5-stable if for every σ ∈ 2^{<ω} there is a σ* ≻ σ such that c(σ, τ) = c(σ, σ*)
for all τ ≻ σ*.

• 6-stable if for every σ ∈ 2^{<ω} we can find a σ* ≻ σ and an i < k such that for
all subtrees T extending σ* which are isomorphic to 2^{<ω}, there is a τ ∈ T such
that c(σ, τ) = i.


1. Prove in RCA₀ that

c is 1-stable → c is 2-stable → c is 4-stable → c is 5-stable

and

c is 1-stable → c is 3-stable → c is 4-stable → c is 5-stable.

2. Prove in RCA₀ that c is 5-stable if and only if it is 6-stable.

3. Prove in RCA₀ that c is 1-stable if and only if it is both 2-stable and 3-stable.

4. Fix i ∈ {1, ..., 6} and let S^iTT²_k be the restriction of TT²_k to i-stable colorings.
Prove that SRT²_k ≤_W S^iTT²_k.


Exercise 9.13.15. Prove that there exists a computable map f : ω × 2^{<ω} → 2 with
the following property: for each n ∈ ω and e < n, if Φ_e is a computable set T ≅ 2^{<ω}
then there exist incomparable nodes σ_0, σ_1 ∈ T such that for each i < 2, f(n, τ) = i
for all τ ∈ T extending σ_i.


Exercise 9.13.16. Complete the proof of Theorem 9.8.3. 


Exercise 9.13.17. Fix n,k > 1. 


1. Show that TT; is identity reducible to MTT*,. 

2. Find an m > 1 such that TT? <w MTT> *---* MTT, where the compositional 
product on the right is taken over m many terms. 

3. Show that RCAg + MTT) — TTY. 


Exercise 9.13.18. Formulate a notion of stability for instances of PMTT* with the 
following properties. Fix a set A and numbers d, k > 1. Let To,...,Ta-1 G w<@ be 
rooted, meet-closed, finitely branching, with height w and no leaves. 


1. If c: So(T,...,Ta-1) — k for some k > 1 is a stable A-computable coloring 
then there is an A’-computable coloring d: S,(To,...,Ta—1) — k such that for 
all (So,...,Sa-1) € Sw(To,..., Ta-1), dis constant on S;(So,..., Sa—1) if and 
only of c is constant on S2(So,..., Sa). 

2. Ifc: S\(Tp,...,Ta-1) — k for some k > 1 is A’-computable then there is an A- 
computable coloring d: S2(To,...,Ta—1) — k such that for all (So, ...,Sa—1) € 
S(T, .--,Ta_-1), dis constant on S| (So,..., Sg_1) if and only of c is constant 


on S2(So, ieee »Sq-1): 


Exercise 9.13.19. Use Theorems 9.10.12 and 9.10.15 to show that OPT <,, AST. 


Exercise 9.13.20 (Downey, Diamondstone, Greenberg and Turetsky [65]).
Prove each of the following. 


1. FIP admits a universal instance. 
2. For each n > 2, nIP has a universal instance. 


Exercise 9.13.21. Prove Proposition 9.10.19. 


Exercise 9.13.22. Show that WKL <.w WWKL. 


Part IV 
Other areas 


Chapter 10


Analysis and topology


Analysis and continuous mathematics have been a topic in logic and effective math- 
ematics throughout the history of the field. While the foundations of arithmetic have 
sometimes been of less interest to “working mathematicians”, the foundations of real 
analysis have traditionally been viewed as more central to mathematical practice. 

Work on formalizing and arithmetizing the real line dates back to Dedekind and 
Cantor, and the real line has been studied by virtually every foundational program 
since. Notable examples include Brouwer’s intuitionistic continuum and Weyl’s work 
on predicativism. Feferman’s later work with higher order logic is a direct influence 
on reverse mathematics. Unsurprisingly, analysis received heavy attention in reverse 
mathematics, especially during the 1980s and 1990s. 

At the same time, continuous mathematics presents a particular challenge for 
reverse mathematics because of the coding required. With countable combinatorics 
and countable algebra, the coding required to express theorems in second order 
arithmetic is often straightforward. Continuous mathematics unavoidably involves 
uncountability, however, even when we focus on fundamental spaces like the real 
line. 

In countable combinatorics and algebra, we can code the individual points or 
objects as natural numbers. With uncountable spaces, we typically code each object 
(for example, each real number) with a set of natural numbers. This allows us to code
spaces whose set of objects has cardinality 2^{ℵ₀}, the size of complete separable
metric spaces, and hence to formalize results about spaces of this size, including
many theorems of real and complex analysis. The cost is that additional work is
required simply to verify properties of the coding systems employed. Moreover, the 
entire space is not represented as a single object in second order arithmetic. Thus 
we can quantify over points of a given space, using a set quantifier, but our ability to 
quantify over all spaces is more limited. 

We begin this chapter with a collection of results to illustrate the key methods 
and phenomena, especially related to subsets of R and functions on the real line. 
(We will omit proofs that are straightforward or standard.) We then turn to some 
results that are less well documented in the secondary literature, including theorems 
from ergodic theory and topological dynamics. We finish with a survey of work on
formalizing non-metric topological spaces. 

The results we study in this chapter can be particularly hard to attribute to a specific 
individual or group. As we have mentioned, the basic properties of the real line and 
complete separable metric spaces have been studied by many foundational programs, 
including constructive and computable analysis, since the 19th century. Some of 
these theorems were among the initial examples of reverse mathematics presented 
by Friedman [108]. Other results are incremental strengthenings of previously known 
results (for example, generalizing from [0, 1] to compact complete separable metric 
spaces). As such, it can be challenging to completely describe the provenance of 
these theorems, especially reversals based on computable counterexamples that may 
date back to the mid 20th century. We have attempted to provide references with 
more detailed treatments of many results, but these should not always be taken as 
definitive. 


10.1 Formalizations of the real line 


There are several well-known ways to “arithmetize” the real line. The traditional rep- 
resentation of real numbers in reverse mathematics uses quickly converging Cauchy 
sequences. In this section, we explore some aspects of this formalization that make 
it especially suitable for our purposes. 

We will represent a real number as the limit of a Cauchy sequence of rationals. 
As usual, two Cauchy sequences (q_n) and (r_n) have the same limit if for every
ε ∈ Q⁺ there is an N such that |q_m − r_n| < ε whenever m, n > N. However, two
Cauchy sequences of rationals with the same limit may not provide the same effective
information about that limit. In order to extract information about the limit from the
terms of the Cauchy sequence, we need to have some bound on how far the terms
can be from the limit, in terms of their indices. The following definition is made
in RCA₀.


Definition 10.1.1 (Quickly converging Cauchy sequence). A sequence (q_n) of
rational numbers is quickly converging if |q_n − q_m| ≤ 2^{-m} whenever n > m. A real
number is defined to be a quickly converging sequence of rationals.


The choice of 2^{-m} in the previous definition is arbitrary (see Exercise 10.9.1). The
spirit of the definition is that, in order to approximate the limit to within ε = 2^{-n},
we only need to look at term n + 1 of the quickly converging Cauchy sequence. 
Of course, there are still many quickly converging Cauchy sequences for each real 
number, so we define equality of reals via an equivalence relation. 
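
As a concrete illustration, here is a small Python sketch using exact rational
arithmetic. The names sqrt2, harmonic, is_quick_on_prefix, and approximate are
ours, chosen for this example. A finite check of this kind can only refute quick
convergence on the terms inspected; it can never establish it.

```python
from fractions import Fraction


def sqrt2(n):
    """n-th term of a quickly converging sequence for sqrt(2), by bisection.

    After n + 1 bisection steps, lo <= sqrt(2) <= hi and hi - lo = 2^-(n+1).
    """
    lo, hi = Fraction(1), Fraction(2)
    for _ in range(n + 1):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo


def harmonic(n):
    """Converges to 0, but too slowly to be quickly converging."""
    return Fraction(1, n + 1)


def is_quick_on_prefix(q, length):
    """Check that q(n) is within 2^-m of q(m) for all m < n < length."""
    return all(
        abs(q(n) - q(m)) <= Fraction(1, 2**m)
        for m in range(length)
        for n in range(m + 1, length)
    )


def approximate(q, n):
    """A rational strictly within 2^-n of the limit: term n + 1 suffices."""
    return q(n + 1)


print(is_quick_on_prefix(sqrt2, 12), is_quick_on_prefix(harmonic, 12))
print(float(approximate(sqrt2, 10)))
```

Both sequences are Cauchy with a rational or irrational limit, but only the first
carries its own error bound: from harmonic alone one cannot read off how many
terms are needed for a given accuracy.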


Definition 10.1.2 (Equality of Cauchy sequences). Two real numbers x = (q_n) and
y = (r_n) are equal if their Cauchy sequences have the same limit.


The real numbers are thus treated as a setoid, with a distinction between intensional 
equality of Cauchy sequences and extensional equality of the real numbers they 
represent. 

There are many other representations of real numbers, including Dedekind cuts,
“unmodulated” Cauchy sequences without a known modulus of convergence, and 
signed binary expansions (see [325]). There are several reasons for choosing quickly 
converging Cauchy sequences. One key reason relates to effectiveness, as shown in 
the next two lemmas. A second reason is that the definition via quickly converging 
Cauchy sequences generalizes directly to arbitrary complete separable metric spaces, 
unlike Dedekind cuts or binary expansions. This allows the real line to be handled 
as a special case in theorems about complete separable metric spaces. 


Lemma 10.1.3. When the real numbers are represented by quickly converging 
Cauchy sequences, the following hold. 


1. The relation x < y on real numbers can be represented with a Σ⁰₁ formula.
2. The relations x ≤ y and x = y on real numbers can be represented with Π⁰₁
formulas.


Lemma 10.1.4. When the reals are represented by quickly converging Cauchy
sequences, the operations x + y, x − y, x × y, and x/y when y ≠ 0 are uniformly
computable by Turing functionals with oracles for x and y.


Neither of these lemmas holds for the representation with Dedekind cuts, or the 
representation with unmodulated Cauchy sequences. Hirst [158] gives a detailed 
study of representations and the effectiveness of conversions between them. 
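
The effectiveness described in the two lemmas can be seen in a short Python sketch.
It is an illustration under our own conventions, not a formal proof: reals are
modeled as functions from indices to exact rationals, the names third, add, and
less_than are hypothetical, and the index shift in add is one standard way to keep
the sum quickly converging.

```python
from fractions import Fraction


def third(n):
    """Quickly converging sequence for 1/3: truncated binary expansion."""
    return Fraction(2**n // 3, 2**n)


def half(n):
    """Constant sequence for 1/2."""
    return Fraction(1, 2)


def add(x, y):
    """Sum of two reals. Shifting indices by one keeps the modulus:
    each summand moves by at most 2^-(m+1) after index m + 1."""
    return lambda n: x(n + 1) + y(n + 1)


def less_than(x, y, max_stage):
    """Search for a witness that x < y.

    If y(n) - x(n) > 2 * 2^-n then x < y, since each term is within 2^-n
    of its limit. Returning None only means no witness was found below
    max_stage; it does not show that x >= y.
    """
    for n in range(max_stage):
        if y(n) - x(n) > Fraction(2, 2**n):
            return n
    return None


two_thirds = add(third, third)
print(float(two_thirds(10)))
print(less_than(third, half, 20))
print(less_than(third, third, 50))
```

The unbounded version of the search in less_than halts exactly when x < y, which
is the computational content of the relation being Σ⁰₁. No similar search can
confirm x = y, in line with equality being Π⁰₁.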

Beyond the real line, we can study complete separable metric spaces with es- 
sentially the same framework. We begin with a countable dense set equipped with 
a metric, and take the quickly converging Cauchy sequences with an equivalence 
relation. The following definition is straightforward to formalize in RCA₀.


Definition 10.1.5 (Coding for complete separable metric spaces). A complete
separable metric space Â is coded with a set A ⊆ ω and a function d : A × A → R⁺
that is a metric on A.

A point in Â is a quickly converging Cauchy sequence ⟨a_i : i ∈ ω⟩ of elements
of A. Per our conventions, this means that d(a_m, a_n) ≤ 2^{-m} whenever n > m.

Each a ∈ A is identified with a point â in Â via a constant Cauchy sequence. We
often write a for â when there is little chance of confusion.


In particular, RCA₀ is strong enough to prove that R, R^n for n ∈ N, 2^N, and N^N are
complete separable metric spaces. The next lemma is a key fact about the complexity
of the metric on a complete separable metric space.


Lemma 10.1.6. Let Â be a complete separable metric space. For x, y ∈ Â and
r ∈ Q⁺, the relation R(x, y, r) ≡ [d(x, y) < r] can be expressed with a Σ⁰₁ formula.
The relation E(x, y) ≡ [x = y] can be expressed with a Π⁰₁ formula.


10.2 Sequences and convergence


In this section, we look at some of the most basic theorems in real analysis, each 
relating to the convergence of sequences. 

The definition of a Cauchy sequence in a complete separable metric space Â =
(A, d) can be directly translated into RCA₀. The terminology in the next definition
follows the convention in which a point “exists” when it is an element of the model
being considered.


Definition 10.2.1 (Convergence of Cauchy sequences). A Cauchy sequence (a_n)
in a complete separable metric space Â converges if there is a point l in Â (that is, a
quickly converging Cauchy sequence of points in A) with lim a_n = l.


RCA₀ proves that every quickly converging Cauchy sequence (of arbitrary points)
in a complete separable metric space converges (see Exercise 10.9.2). If we do not
require the sequence to be quickly converging, however, we will see that ACA₀ is
required to find the limit.


Lemma 10.2.2. ACA₀ proves that every Cauchy sequence in a complete separable
metric space converges.


Proof. Let (x_i) be a Cauchy sequence in a complete separable metric space Â =
(A, d). Use arithmetical comprehension to form the set S of all pairs ⟨a, r⟩ ∈ A × Q⁺
for which the sequence (x_i) is eventually in the ball B(a, r), that is,


⟨a, r⟩ ∈ S ↔ (∃N)(∀i > N)[d(x_i, a) < r].


Because (x_i) is a Cauchy sequence, for each r ∈ Q⁺ there is at least one a with
⟨a, r⟩ ∈ S.

The remainder of the proof is effective relative to the oracle S. Working from
a fixed enumeration of A, we can effectively produce a sequence (a_i) such that
⟨a_i, 2^{-(i+1)}⟩ ∈ S for all i. We claim that (a_i) is a quickly converging Cauchy
sequence. Given j > i, we can choose k large enough that d(x_k, a_i) < 2^{-(i+1)} and
d(x_k, a_j) < 2^{-(j+1)} < 2^{-(i+1)}. Hence d(a_i, a_j) < 2^{-i}. □


Theorem 10.2.3. The following are equivalent over RCA₀.


1. ACA₀.
2. Every Cauchy sequence in a complete separable metric space converges.
3. Every Cauchy sequence of real numbers converges.
4. Every bounded sequence of real numbers has a convergent subsequence.
5. The restriction of (3) or (4) to sequences in [0, 1].


Proof. Lemma 10.2.2 shows that ACA₀ implies (2), which trivially implies (3).
RCA₀ proves every Cauchy sequence is bounded, so (3) implies (4). Conversely,
RCA₀ proves that if a Cauchy sequence (a_n) has a convergent subsequence then
(a_n) converges to the limit of that subsequence, so (4) implies (3). Because we can
scale functions in RCA₀, both (3) and (4) are equivalent to their restrictions to [0, 1].
Therefore, it is sufficient to assume that every Cauchy sequence of real numbers
converges, and prove ACA₀.

Let f : N → N be given. We will show that the range of f exists, which is
sufficient by Theorem 5.7.2. For each m, let A_m be the set of n < m which are
in the range of f ↾ m. Let q_m = Σ_{n ∈ A_m} 2^{-(2n+1)}. Intuitively, thinking of the binary
expansion of q_m, we use the odd-numbered bits to track values that have entered the
range of f by stage m. These odd bits cannot interact because they are spaced apart
with the even-numbered bits.

We first prove in RCA₀ that (q_m) is a Cauchy sequence. Given ε > 0, choose k
with 2^{-k} < ε. By Σ⁰₁ bounding there is an r such that for all n ≤ k, if n is in the
range of f then n is in the range of f ↾ r. This means that the bits corresponding to
0, ..., k will not change their value after stage r. In particular, for i > r, we have


|q_i − q_r| ≤ Σ_{j=k+1}^{∞} 2^{-(2j+1)} < Σ_{j=k+1}^{∞} 2^{-j} = 2^{-k} < ε.


Now, suppose that we have a quickly converging Cauchy sequence (a_i) that
converges to the same point x as (q_m). Given n, to determine if n ∈ range(f), we
can simply consult a_{2n+2}. It follows from construction that the first 2n + 1 bits of
a_{2n+2} must agree with the first 2n + 1 bits of x. Otherwise, there would be some
k > 2n + 2 with d(a_k, a_{2n+2}) > 2^{-(2n+1)}, which contradicts the choice of (a_i).
Therefore, we have that n ∈ range(f) if and only if bit 2n + 1 of a_{2n+2} is 1. □
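
The coding in this reversal can be made concrete. The following Python sketch is
our own illustration, with hypothetical names q, in_range, and square: it computes
q_m for a sample function and decodes membership in the range from a rational
approximation to the limit. To sidestep carries in binary expansions, the decoder
here assumes a slightly finer approximation, within 2^{-(2n+3)} of the limit, and
rounds; that is a choice made for the sketch rather than a transcription of the proof.

```python
from fractions import Fraction
from math import floor


def q(f, m):
    """q_m: sum of 2^-(2n+1) over n < m occurring among f(0), ..., f(m-1)."""
    seen = {f(i) for i in range(m)}
    return sum(
        (Fraction(1, 2 ** (2 * n + 1)) for n in range(m) if n in seen),
        Fraction(0),
    )


def in_range(a, n):
    """Decide whether n is in the range of f, given a rational a within
    2^-(2n+3) of x = lim q_m.

    Scaling x by 2^(2n+1) gives an integer I plus a fractional part of at
    most 1/3, and I is odd exactly when n is in the range. The scaled
    error is at most 1/4, so adding 1/4 and taking the floor recovers I.
    """
    scaled = a * 2 ** (2 * n + 1)
    return floor(scaled + Fraction(1, 4)) % 2 == 1


def square(i):
    """Sample function; its range is the set of perfect squares."""
    return i * i


x_approx = q(square, 20)  # agrees with the limit far beyond the bits used below
decoded = [
    n for n in range(8)
    if in_range(x_approx + Fraction(1, 2 ** (2 * n + 3)), n)
]
print(decoded)
```

For a computable f with noncomputable range, the sequence of values q(f, m) is
still computable, but no computable quickly converging sequence can have the same
limit, since in_range would then decide the range.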


A Specker sequence is a monotone increasing, bounded sequence of rationals 
whose limit is not a computable real number. These sequences have a long history in 
logic, and were first studied by Specker [299] in 1949. In the previous proof, if f is a 
computable function with a noncomputable range, the sequence that is constructed 
is a Specker sequence. 

Another elementary result in real analysis shows that every bounded sequence 
in R has a supremum. 


Theorem 10.2.4. The following are equivalent over RCA₀.


1. ACA₀.
2. Every bounded sequence of real numbers has a least upper bound.
3. Every bounded, monotone increasing sequence of real numbers converges.
4. Every bounded, monotone increasing sequence of rational numbers converges.
5. The restriction of (2), (3), or (4) to sequences in [0, 1].


Proof. The implication from ACA₀ to (2) is similar to Lemma 10.2.2, and is left to
Exercise 10.9.10. The implication from (2) to (3) is straightforward: given a least
upper bound x for a bounded, monotone increasing sequence (a_i), RCA₀ proves that
(a_i) converges to x. The implication from (3) to (4) is trivial.

RCA₀ also proves that every bounded, monotone sequence is Cauchy, which shows
that (4) implies ACA₀ via Theorem 10.2.3. Additionally, the reversal in that proof
constructs a monotone sequence of rationals, providing a direct reversal. □


10.3 Sets and continuous functions


Individual real numbers (or points, more generally) are rarely the focus of analysis. 
Sets of real numbers, and continuous functions, are of more interest. As we have 
mentioned, this leads to a key challenge in studying continuous mathematics within 
second order arithmetic. 

We cannot hope to code arbitrary sets of points, or arbitrary functions, using 
only second order objects. Each real number is represented as a Cauchy sequence of 
rationals, and by unwinding the codings a real number is ultimately represented as a 
set of natural numbers. A simple cardinality argument shows we cannot code every 
set of real numbers as a single set of naturals. We can code the most important sets, 
however. There are straightforward codings for open and closed sets, and even Borel 
and some projective sets can be coded by more complicated methods that we will 
discuss in Section 12.2. Similarly, continuous functions between complete separable 
metric spaces can be coded as sets of naturals, even though arbitrary functions 
cannot. 


10.3.1 Sets of points 


To code open sets, we rely on the fact that the open balls of rational radius, centered 
on points from a countable dense set, form a basis for any complete separable metric 
space. We use the standard notation for open and closed balls. If Â is a complete
separable metric space, x ∈ Â and r ∈ Q⁺, we will use the notation B_r(x) to refer to
the open ball {w ∈ Â : d(w, x) < r}. The closed ball B̄_r(x) is {w ∈ Â : d(w, x) ≤ r}.
Of course, neither of these sets can be directly represented in second order arithmetic,
but the relations w ∈ B_r(x) and w ∈ B̄_r(x) can be represented by Σ⁰₁ and Π⁰₁ formulas,
respectively.


Definition 10.3.1 (Open and closed sets). Let Â be a complete separable metric
space.


• An open set is coded by a sequence (a(i), r(i)) in A × Q⁺. A sequence of that
sort codes the set ⋃_i B_{r(i)}(a(i)).

• A closed set of real numbers is represented with a code for the complementary
open set.

This definition allows us to perform basic set algebra on open and closed sets, 
effectively on their codes. The definition for closed sets may seem unexpected. One 
advantage is that it is (trivially) straightforward to convert the code for an open set to 
a code for its complement, and vice versa. We have already seen that closed sets in 2^ω
and ω^ω can be represented as the set of paths through a tree, which is often a more
useful coding. We will see a third method for coding closed sets in Section 10.5.1.
We will see the specific coding we have available for a closed set can be a key tension
in a reversal.


Lemma 10.3.2. Let Â be a complete separable metric space. The following hold.


1. The union and intersection of two open sets U and V are uniformly computable 
from the codes for U and V. The union of a sequence of open sets is uniformly 
computable from the sequence. 

2. The union and intersection of two closed sets U and V are uniformly computable 
from the codes for U and V. The intersection of a sequence of closed sets is 
uniformly computable from the sequence. 

3. The relation that a real is inside an open set is Σ⁰₁, and the relation that a real
is in a closed set is Π⁰₁.


This lemma leads to a corollary that, in RCA₀, we may form a code for the union
or intersection of two coded open sets, or for the union of a sequence of coded open 
sets. Similarly, we can form a code for the union or intersection of two coded closed 
sets, or the intersection of a sequence of coded closed sets. 
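
For the special case of the real line, these operations on codes are easy to sketch
in Python. The following is an illustration under simplifying assumptions of our
own: codes are finite lists rather than infinite sequences, centers are rational,
and intersection_code uses the fact that two intervals in R meet in an interval,
which does not generalize verbatim to arbitrary metric spaces. All function names
are hypothetical.

```python
from fractions import Fraction


def union_code(u, v):
    """A code for the union: list the balls of both codes."""
    return u + v


def intersection_code(u, v):
    """A code for the intersection of two open subsets of R.

    Each pair of balls (intervals) meets in an interval, which is again a
    ball with rational center and radius, or in nothing at all.
    """
    code = []
    for a, r in u:
        for b, s in v:
            lo, hi = max(a - r, b - s), min(a + r, b + s)
            if lo < hi:
                code.append(((lo + hi) / 2, (hi - lo) / 2))
    return code


def find_witness(x, code, max_stage):
    """Search for (i, n) witnessing that the real x lies in ball i.

    x is a quickly converging sequence, so x(n) is within 2^-n of its
    limit; if |x(n) - a| + 2^-n < r then the limit is in the open ball.
    Returning None only means no witness was found below max_stage.
    """
    for n in range(max_stage):
        for i, (a, r) in enumerate(code):
            if abs(x(n) - a) + Fraction(1, 2**n) < r:
                return (i, n)
    return None


def third(n):
    """Quickly converging sequence for 1/3."""
    return Fraction(2**n // 3, 2**n)


def two(n):
    """Constant sequence for 2."""
    return Fraction(2)


U = [(Fraction(0), Fraction(1))]  # the interval (-1, 1)
V = [(Fraction(1), Fraction(1))]  # the interval (0, 2)
print(intersection_code(U, V))
print(find_witness(third, intersection_code(U, V), 10))
print(find_witness(two, union_code(U, V), 10))
```

Membership in the open set is confirmed by a finite search, while membership in
the complementary closed set can only be refuted by such a search, matching the
Σ⁰₁ and Π⁰₁ bounds in part (3) of the lemma.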


10.3.2 Continuous functions 


The last basic coding needed for real analysis is a way to represent continuous 
functions from one complete separable metric space to another. We have seen a 
simple definition already, in Definition 3.5.5, that is suitable for some questions 
about specific continuous functions, but has limitations. For example, there is no 
immediate way to compose two functions coded in the style of that definition. Thus 
an alternate definition is used more often in the reverse mathematics literature. This 
definition is somewhat technical and benefits from motivation. 

Although the “ε-δ” definition of continuity is typically expressed in terms of strict
inequalities (i.e., open balls), it can be stated equivalently with non-strict inequalities
(i.e., closed balls). Thus a function f : R → R is continuous if, for all x ∈ R, for
every closed ball B̄_ε(f(x)) there is a closed ball B̄_δ(x) with f(B̄_δ(x)) ⊆ B̄_ε(f(x)).
We will be interested in closed balls because, when we approximate a point x as a 
limit of a sequence of points (a;), and each a; is in a particular open ball B, the most 
we know is that x is in the closure of B. 

To represent the function f in second order arithmetic, we want to capture infor- 
mation about which closed balls map into which other closed balls without referring 
directly to points in the space. A code for f will be a sequence F that enumerates
certain facts of the form f(B_r(a)) ⊆ B_s(b), for a ∈ A, b ∈ B, and r, s ∈ Q⁺. The
inclusion of a four-tuple (a, r, b, s) in F shows that this set inclusion holds.

The code F does not need to include every such inclusion, however. It only needs 
to include enough information about f for us to recover a value for f(x) for each 
x € A. The domain of F will be the set of points of A for which the code does 
have enough information to recover the value of f. Moreover, we only need F to 
enumerate a sufficient set of facts about f. Thus F will really be a set of five-tuples
(i, a_i, r_i, b_i, s_i), but we frequently simplify notation by not writing i. When there
is no chance of confusion, we will identify the code F with the function it codes,
writing F(x) and f(x) interchangeably.


Definition 10.3.3 (Continuous function codes). Let Â and B̂ be complete separable
metric spaces. A code for a continuous function from Â to B̂ is a sequence F of
four-tuples (a, r, b, s) with a ∈ A, b ∈ B, and r, s ∈ Q⁺ satisfying the following
properties.


1. If (a, r, b, s) and (a, r, b′, s′) are enumerated in F, then d_B(b, b′) ≤ s + s′.
Intuitively, this means that if f(B_r(a)) ⊆ B_s(b) ∩ B_{s′}(b′) then B_s(b) and
B_{s′}(b′) must overlap.

2. If (a, r, b, s) ∈ F and d(a′, a) + r′ < r then (a′, r′, b, s) ∈ F. Intuitively, if
B_{r′}(a′) is a subset of B_r(a) in the strong sense that d(a, a′) + r′ < r, and
f(B_r(a)) ⊆ B_s(b) then f(B_{r′}(a′)) ⊆ B_s(b) as well.

3. If (a, r, b, s) ∈ F and d(b, b′) + s < s′ then (a, r, b′, s′) ∈ F. This condition is
analogous to (2) but for closed balls that contain B_s(b) in a strong sense.


A point x ∈ Â is in the domain of F if for every t ∈ Q⁺ there is a tuple (a, r, b, s)
enumerated in F with s < t and d(x, a) < r. The set F is a continuous function code
from Â to B̂ if the domain of F is Â. As mentioned, we will often identify F with
the function it codes.


The next result shows that RCApo is strong enough to evaluate a coded continuous 
function at each point of its domain. 


Proposition 10.3.4. The following is provable in RCA₀. If F is a continuous
function code from A to B, and x is in the domain of F, there is a point y = f(x) ∈ B
with d(y, b) ≤ s whenever (a, r, b, s) is enumerated in F and d(a, x) < r. Moreover,
the point y is unique up to equality of points in B. 
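
The evaluation described in the proposition can be pictured as an unbounded search through the enumeration. The following Python sketch is only an illustration under stated assumptions: `doubling_code` is a hypothetical enumeration of tuples for f(x) = 2x on [0, 1] (it ignores the closure conditions (2) and (3), which do not matter for evaluation), a point x is represented by a function returning rational approximations within 2^-n, and `evaluate` is a helper name invented here.

```python
from fractions import Fraction
from itertools import count

def doubling_code():
    """Enumerate tuples (a, r, b, s) for f(x) = 2x on [0, 1] (illustrative only).

    Each tuple records that f maps the ball B_r(a) into the closed ball of
    radius s around b.
    """
    for n in count():
        r = Fraction(1, 2**n)
        for k in range(2**n + 1):
            a = Fraction(k, 2**n)
            yield (a, r, 2 * a, 2 * r)

def evaluate(code, x_approx, t, max_stages=5000):
    """Search the enumeration for a tuple with s < t and d(x, a) < r; return b.

    x_approx(n) must return a rational within 2^-n of x. The returned b then
    satisfies |f(x) - b| <= s < t.
    """
    seen = []
    for stage, tup in enumerate(code):
        if stage >= max_stages:
            raise RuntimeError("no witness found within the stage bound")
        seen.append(tup)
        q, err = x_approx(stage), Fraction(1, 2**stage)
        for a, r, b, s in seen:
            # |q - a| + err < r certifies d(x, a) < r using only the approximation q.
            if s < t and abs(q - a) + err < r:
                return b

# The constant sequence 1/3, 1/3, ... is a valid name for the point x = 1/3.
third = lambda n: Fraction(1, 3)
approx = evaluate(doubling_code(), third, Fraction(1, 100))
```

The search is guaranteed to stop only when x is in the domain of the code, which matches the hypothesis of the proposition; the stage bound is a practical safeguard and not part of the mathematics.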


RCA₀ is strong enough to produce codes for the functions we commonly encounter.


Example 10.3.5. The following are provable in RCAo. 


1. There are codes for the addition, subtraction, multiplication, and division of 
functions on R (where they are defined). 

2. There is a code for each polynomial function on R. 

3. For each m there are codes for the functions f(x) = Σ_{i=1}^{m} x_i and g(x) = Π_{i=1}^{m} x_i.


Example 10.3.6. The following are provable in RCAo. Let A, B, and C be complete 
separable metric spaces. 


1. The identity function f(x) =x has a code. 

2. The metric function d: A × A → R has a code.

3. For each y ∈ B, there is a code for the constant function f: A → B given by
f(x) = y.

4. If f: A → B and g: B → C are coded continuous functions then there is a code
for the composition g ∘ f.


The final lemma of this section shows that the definition of a continuous function
code is effective in a certain sense. Recall that, for y, z ∈ R, the relation z > y can
be expressed as a Σ⁰₁ formula. If F is a continuous function code from a complete
separable metric space A to R, this gives a naively Δ¹₁ method to express the relation
F(x) > y: we can write (∃z)[z = F(x) ∧ z > y] or (∀z)[z = F(x) → z > y].
The next lemma shows we can leverage the coding to express F(x) > y with no set
quantifiers. The proof is Exercise 10.9.9.


Lemma 10.3.7. Suppose F is a continuous function code from a complete separable
metric space A to R, and y ∈ R. The relation P(x, y) ≡ F(x) > y is given by a Σ⁰₁
formula with a parameter for F. 
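
One way to see why such a formula should exist is the following sketch, which is offered as a plausible route and not as the intended solution to the exercise. It assumes x is in the domain of F and uses the bound d(F(x), b) ≤ s for tuples with d(x, a) < r.

```latex
F(x) > y
\iff
(\exists \langle a, r, b, s \rangle \text{ enumerated in } F)\,
\bigl[\, d(x,a) < r \;\wedge\; b - s > y \,\bigr]
```

For the right-to-left direction, d(x, a) < r gives F(x) ≥ b − s > y. For the left-to-right direction, if F(x) = z > y, choose t ∈ Q⁺ with 2t < z − y and a tuple with s < t and d(x, a) < r; then b − s ≥ z − 2s > y. Both conjuncts are strict inequalities between reals, and the leading quantifier ranges over the indices of the enumeration, so no set quantifier is used.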


The preceding lemma and examples form part of the justification for representing 
continuous functions with the particular codes we have defined. Another justification 
is experience with many results in reverse mathematics, where this method has been 
able to facilitate meaningful results. Nevertheless, the optimality of this coding 
method is a question of genuine interest. In Section 12.4, we will see results in 
higher order reverse mathematics that help illuminate aspects of the question. 


10.4 The intermediate value theorem 


In this section, we study the reverse mathematics of the Intermediate Value Theorem 
from the perspectives of second order arithmetic and computable reducibility, ex- 
tending the results from Chapter 3. The intermediate value theorem is a particularly 
interesting result because it is computably true, but not uniformly computably true. 
It is thus a useful example of how considering several reducibility notions can help 
us understand a theorem more deeply. 

The first result we consider is a formalized version of Proposition 3.7.3. We 
include a detailed proof to demonstrate the additional verification needed to ensure 
the proof carries through in RCA₀.


Theorem 10.4.1. The following is provable in RCA₀. If F is a coded continuous
function from an interval [a, b] to R, and y is between f(a) and f(b), there is an
x ∈ [a, b] with F(x) = y.


Proof. We work in RCA₀. We assume without loss of generality that y = 0 and
f(a) < 0. If there is a q ∈ Q ∩ [a, b] with F(q) = 0 then we may simply let
x = q. Suppose not, so F(q) > 0 or F(q) < 0 for all q ∈ Q ∩ [a, b]. Because the
relations P(q) ≡ F(q) > 0 and N(q) ≡ F(q) < 0 are each Σ⁰₁, we can form the set
S = {q ∈ Q ∩ [a, b] : F(q) < 0} with Δ⁰₁ comprehension.

We now mimic the usual proof via subdivision. By induction, using S as a
parameter, there is a sequence ⟨(a_n, b_n) : n ∈ N⟩ such that a_0 = a, b_0 = b, and


(a_{n+1}, b_{n+1}) = ((a_n + b_n)/2, b_n)   when F((a_n + b_n)/2) < 0,
(a_{n+1}, b_{n+1}) = (a_n, (a_n + b_n)/2)   when F((a_n + b_n)/2) > 0.


It follows by induction that F(a_n) < 0 and F(b_n) > 0 for all n. Because |b_n − a_n| =
2^{-n} for all n, it is straightforward to verify that x = ⟨a_n⟩ is a quickly converging
Cauchy sequence. We claim that F(x) = 0.

Suppose not. If F(x) = z > 0 then, because x is in the domain of F, there must
be some (p, r, q, s) enumerated in F with s < z/2 and d(x, p) < r. We may thus
choose n large enough that a_n, b_n ∈ (p − r, p + r). This means that F(a_n) ≥ z − 2s > 0,
contradicting the fact that F(a_n) < 0. If F(x) < 0, a similar argument shows that
F(b_n) < 0 for some n, which is again a contradiction. □
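
The subdivision step of the proof can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the text: F is given as an exactly evaluable function on rationals, the endpoints are rational, and the example function q² − 2 is chosen because it has no rational root, which is the case the proof handles by halving.

```python
from fractions import Fraction

def bisect(F, a, b, steps):
    """Run `steps` rounds of interval halving, assuming F(a) < 0 < F(b).

    F is assumed to have no rational root in [a, b], mirroring the case
    distinction made at the start of the proof.
    """
    assert F(a) < 0 < F(b)
    for _ in range(steps):
        mid = (a + b) / 2
        value = F(mid)
        if value == 0:
            raise ValueError("rational root found; the proof treats this case separately")
        if value < 0:
            a = mid      # keeps F(a_n) < 0
        else:
            b = mid      # keeps F(b_n) > 0
    return a, b

# Illustrative instance: F(q) = q^2 - 2 on [0, 2], whose only root there is irrational.
lo, hi = bisect(lambda q: q * q - 2, Fraction(0), Fraction(2), 20)
```

The sign test on `value` is decidable here only because the example is a polynomial evaluated on rationals. In the proof, the corresponding decision is made by consulting the set S.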


As we discussed in Chapter 3, the proof begins with a division into cases, depending
on whether there is a q ∈ Q ∩ [a, b] with F(q) = 0. Because RCA₀ includes the
law of the excluded middle, there is no obstacle to making such ineffective choices
during a proof. But the choice means that this proof does not yield an algorithm to
compute the desired point x. In fact, there is no such algorithm, as we saw in
Proposition 3.7.4, which stated that the problem IVT does not uniformly admit computable
solutions.

Theorem 10.4.1 and Proposition 3.7.4 show that an analysis in second order 
arithmetic is too coarse to fully explore the intermediate value theorem. We have seen 
this phenomenon already with combinatorial results. IVT shows the phenomenon is 
not limited to that area. As with combinatorics, moving to finer reducibility notions 
allows us to understand IVT better. 

A Weihrauch-style analysis of the intermediate value theorem was performed by 
Brattka and Gherardi [19]. They show that the following problem B is equivalent 
to C_I (the principle of choice for closed intervals), and provide much additional
information on this and other choice principles for closed sets. 


Definition 10.4.2. The Weihrauch principle B has as its instances all sequences of
pairs of rational numbers ⟨(a_n, b_n) : n ∈ ω⟩ in [0, 1] such that a_n ≤ a_{n+1} ≤ b_{n+1} ≤
b_n for all n. A solution of an instance ⟨(a_n, b_n)⟩ is a real x with a_n ≤ x ≤ b_n for all n.


The principle B is implicit in some informal proofs of IVT. The next theorem 
makes this implicit appearance into a formal result on computable reducibility. 


Theorem 10.4.3. IVT ≡_sW B.


Proof. We first show that IVT ≤_sW B. Let f be an instance of IVT. We build an
instance ⟨(a_n, b_n)⟩ of B inductively. Let a_0 = 0 and b_0 = 1. At stage k + 1, we
perform an effective search, for k time steps, looking for the pair of rationals p, q
with a_k < p < q < b_k and f(p) · f(q) < 0 that minimizes the value of |p − q|.
This is a finite search because we limit the number of steps of computation that can
be performed. If we find such a pair, we let a_{k+1} = p and b_{k+1} = q. Otherwise, we
let a_{k+1} = a_k and b_{k+1} = b_k. It is immediate that ⟨(a_n, b_n)⟩ is an instance of B.
Moreover, we must have lim |b_k − a_k| = 0. Otherwise, given that the number of time
steps available in the construction becomes arbitrarily large, we would eventually
find a pair of rationals p, q that bracket a root of f with |q − p| < lim |b_k − a_k|, a
contradiction to the construction. The (unique) solution x to this instance of B is
then a root of the function f.

Next, we show that B ≤_sW IVT. For each rational q, define the functions


and 


Thus P_q and N_q are computable functions that are strictly positive on [0, q)
and (q, 1], respectively. Given an instance ⟨(a_n, b_n)⟩ of B, define a function f(x) with
f(0) = 1, f(1) = −1, and otherwise


f(x) = ( Σ_{n : a(n) > x} 2^{-n} P_{a(n)}(x) ) − ( Σ_{n : b(n) < x} 2^{-n} N_{b(n)}(x) ).


Then f(x) is a computable function and f^{-1}(0) is exactly the set of solutions to the
instance of B. □
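
The first half of the proof builds the nested intervals by a sequence of bounded searches. The Python sketch below illustrates that construction under assumptions of its own: the phrase "k time steps" is modeled by restricting the search at stage k to rationals with denominator at most k, the function f is assumed to be exactly evaluable on rationals, and the names `candidates` and `build_instance` are invented for this example.

```python
from fractions import Fraction

def candidates(lo, hi, k):
    """All rationals with denominator at most k strictly between lo and hi."""
    found = {Fraction(n, d) for d in range(1, k + 1) for n in range(d + 1)}
    return sorted(q for q in found if lo < q < hi)

def build_instance(f, stages):
    """Return the pairs (a_0, b_0), ..., (a_stages, b_stages) of nested intervals."""
    a, b = Fraction(0), Fraction(1)
    pairs = [(a, b)]
    for k in range(stages):
        pts = candidates(a, b, k)
        best = None
        for i, p in enumerate(pts):
            for q in pts[i + 1:]:
                # Keep the sign-changing pair with the smallest gap seen so far.
                if f(p) * f(q) < 0 and (best is None or q - p < best[1] - best[0]):
                    best = (p, q)
        if best is not None:
            a, b = best          # a bracketing pair was found at this stage
        pairs.append((a, b))     # otherwise the previous interval is repeated
    return pairs

# Illustrative instance: f(x) = 2x^2 - 1 on [0, 1], with f(0) < 0 < f(1).
pairs = build_instance(lambda x: 2 * x * x - 1, 10)
```

Each stage is a finite search, so the whole sequence is computable from f even though no single stage is guaranteed to make progress.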


We have now seen two results characterizing the strength of the intermediate 
value theorem. Overall, the intermediate value theorem has these properties: 


1. Every computable instance has a computable solution. 

2. RCA₀ proves that every instance of IVT has a solution. This shows that not only
can the theorem be proved without noncomputable set existence principles, it
can also be proved with weak induction axioms.

3. IVT is equivalent to B.

4. IVT does not admit uniformly computable solutions.

5. Uniformly computable solutions are admitted for the special case of IVT in which
the function has only one root, or in which the set of roots is nowhere dense
(Exercise 10.9.14).



This combination of properties is a remarkable illustration of how the combina- 
tion of different reducibilities can lead to a detailed understanding of a theorem. The 
intermediate value theorem has also been studied in constructive reverse mathemat- 
ics; see Berger, Ishihara, Kihara, and Nemoto [14] for another perspective on the 
theorem. 


10.5 Closed sets and compactness 


In this section, we consider results about closed sets in complete separable metric
spaces. The real line, unit interval, Cantor space 2^ω, and Baire space ω^ω are key
examples. The theorems we study are among the most basic in analysis, including
results on separability and compactness. Yet many of these results are not computably
true, and some even rise to the strength of Π¹₁ comprehension.

There are many ways to characterize compactness of complete separable metric 
spaces, including total boundedness, the open cover property, and every sequence 
having a convergent subsequence. In order to study the latter two properties within 
RCAo, we use a version of total boundedness to define compactness. The following 
definition can be readily formalized in RCAo. 


Definition 10.5.1 (Compact metric spaces). A complete separable metric space
A is compact if there is an infinite sequence ⟨A_i : i ∈ N⟩ of finite sequences of
points, A_i = ⟨a_{i,0}, …, a_{i,n_i}⟩, so that for each z ∈ A and each i there is a j ≤ n_i with
d(z, a_{i,j}) < 2^{-i}.
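
For the unit interval, a witnessing sequence of finite nets is easy to write down. The following Python sketch is an illustration only: the choice A_i = ⟨k/2^i : 0 ≤ k ≤ 2^i⟩ and the helper names `net` and `witness` are assumptions made here, and the check is run on a few rational sample points.

```python
from fractions import Fraction

def net(i):
    """The finite sequence A_i = <k / 2^i : 0 <= k <= 2^i> of points of [0, 1]."""
    return [Fraction(k, 2**i) for k in range(2**i + 1)]

def witness(z, i):
    """Return an index j with |z - a_{i,j}| < 2^-i, for a rational z in [0, 1]."""
    for j, a in enumerate(net(i)):
        if abs(z - a) < Fraction(1, 2**i):
            return j
    raise ValueError("z is outside [0, 1]")

samples = [Fraction(0), Fraction(1, 3), Fraction(5, 7), Fraction(1)]
indices = [[witness(z, i) for z in samples] for i in range(6)]
```

Here n_i = 2^i, so the sequence of bounds is itself computable, which is the kind of uniformity the definition asks for.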


RCA₀ proves that many common spaces are compact, including [0, 1], the Cantor
space 2^ω, finite products [0, 1]^n, and the Hilbert cube [0, 1]^N. We will see that
stronger systems are necessary to prove that spaces like these are compact in other
senses. This difference in the strength of different definitions of compactness is the
motivation for our choice of a definition in RCA₀.

A standard result shows that every sequence in a compact space has a convergent
subsequence. The next theorem shows this result is equivalent to ACA₀ over RCA₀.
The underlying computable counterexample is again a Specker sequence.


Theorem 10.5.2. The following are equivalent over RCAo. 


1. ACA₀.
2. Each sequence in a compact metric space has a convergent subsequence. 
3. Each sequence in [0,1] has a convergent subsequence. 


Proof. We first prove (2) in ACA₀. Let A be a compact metric space, witnessed by
sequences (A_i) and (n_i). Let (x_k) be a sequence of points in A.

Using arithmetical comprehension, we may form the set S of all sequences
⟨j_0, …, j_n⟩ such that there are infinitely many k with


x_k ∈ B(a_{0,j_0}, 1) ∩ ⋯ ∩ B(a_{n,j_n}, 2^{-n}).


Using the pigeonhole principle in ACA₀, there is at least one sequence of length 1 in S,
so S is nonempty. Moreover, if ⟨j_0, …, j_n⟩ ∈ S then, by the pigeonhole principle
again, there is at least one j_{n+1} with ⟨j_0, …, j_n, j_{n+1}⟩ ∈ S. Hence S is a nonempty
tree with no dead ends.

Let f be an infinite path in S. Define a sequence k(m) inductively as follows:
k(0) is the least k with x_k ∈ B(a_{0,f(0)}, 1), and k(m + 1) is the least k > k(m) with


x_k ∈ B(a_{0,f(0)}, 1) ∩ ⋯ ∩ B(a_{m+1,f(m+1)}, 2^{-(m+1)}).


Then ⟨x_{k(m)} : m ∈ N⟩ is a convergent subsequence of (x_k), as desired. The special
case (3) follows immediately.

For the reversal, assume every sequence in [0, 1] has a convergent subsequence.
RCA₀ proves that [0, 1] is a compact space. Thus, in particular, each Cauchy sequence
in [0, 1] must have a convergent subsequence, whose limit RCA₀ can verify is the limit of
the Cauchy sequence. This yields ACA₀ by Theorem 10.2.3. □


We have seen several examples where the strength of results about [0, 1] and
2^N are the same. Interestingly, the sequential compactness of 2^N is far weaker
than sequential compactness of the unit interval. By formalizing Proposition 4.1.7,
we see that sequential compactness of 2^N is equivalent to COH. The next lemma
simultaneously generalizes and formalizes Proposition 3.8.1.


Lemma 10.5.3. WKL₀ proves that each open cover of a compact metric space has a
finite subcover. That is, if (U_i) is a sequence of open sets in a compact metric space
A that covers the space, then there is an N such that A = ⋃_{i<N} U_i.


Proof. Let A be a compact metric space, witnessed by sequences (A_i) and (n_i) as
in Definition 10.5.1. Let (U_i) be a sequence of open sets that covers A. Without loss
of generality, we may assume each U_i is an open ball B(x_i, r_i).

We construct a Π⁰₁ tree T ⊆ N^{<N}. Intuitively, we build the tree so that any
infinite path would be a Cauchy sequence, consisting of points of the form a_{i,j},
converging to a hypothetical point not covered by (U_i). Under the assumption that
(U_i) does cover A, there is no such point, so the tree will be finite, which will allow
us to extract a finite subcovering.

For each sequence τ ∈ T, each element τ(i) will refer to a point t_i = a_{i,τ(i)} ∈ A.
We put a sequence τ into T if the following conditions are met.


1. τ(i) ≤ n_i for i < |τ|.
2. For all i, j < |τ|, d(t_i, t_j) ≤ 2^{-i} + 2^{-j}.
3. For all i, j < |τ|, d(t_i, x_j) + 2^{-i} ≥ r_j.


Item (1) ensures that the notation t_i is well defined. Item (2) ensures that, if f is
an infinite path in T, the corresponding sequence ⟨t_i : i ∈ N⟩ provides a quickly
converging Cauchy sequence with some limit x ∈ A. Item (3) ensures that, if f is
an infinite path in T, then for all j we have x ∉ B(x_j, r_j) (taking the limit as i → ∞
in (3) yields d(x, x_j) ≥ r_j).

Item (1) is Δ⁰₁, and the remaining two items are each Π⁰₁, by Lemma 10.1.3. Thus
the tree T is a Π⁰₁ tree and, by formalizing Exercise 2.9.13, RCA₀ proves there is a
tree T′ ⊇ T with the same infinite paths as T. Since T and thus T′ have no infinite
paths, by bounded König’s lemma the tree T′ must be finite. Hence there is an N
such that T′, and hence T, has no sequence of length N.

We claim that ⟨B(x_i, r_i) : i < N⟩ covers A. If not, let x be a point that is not
covered. By induction we can construct a sequence τ of length N + 1 so that
d(t_i, x) < 2^{-i} for i < |τ|. Then τ ∈ T, which is a contradiction. □


Examining the forward implication in the proof yields the following corollary. 


Corollary 10.5.4. The following is provable in WKL₀. If (A_i) is a sequence of
compact metric spaces and (U_{i,n}) is a sequence such that each ⟨U_{i,n} : n ∈ N⟩ is an
open cover of A_i, then there is a sequence (N_i) such that ⟨U_{i,1}, …, U_{i,N_i}⟩ is a cover
of A_i for each i.


The next theorem, a version of the Heine–Borel theorem, was also among the
original examples provided by Friedman.


Theorem 10.5.5. The following are equivalent over RCA₀.


1. WKL₀.

2. For every compact metric space A and sequence (U_i) of open sets that covers A,
there is an n such that ⟨U_i : i < n⟩ covers A.

3. For every sequence (V_i) of open sets that covers [0, 1], there is an n such that
⟨V_i : i < n⟩ is also a covering of [0, 1].


Proof. The proof that WKL₀ implies (2) is Lemma 10.5.3, and the implication from
(2) to (3) is immediate. Therefore, we work in RCA₀ and prove that (3) implies WKL₀.

Let T ⊆ 2^{<N} be a tree with no infinite paths. Our goal is to prove that T is finite.
The construction in Lemma 3.8.2 can be formalized directly in RCA₀. Let U be an
enumeration of open intervals corresponding to the tree T, along with the auxiliary
sequences (I_i) and (J_τ).

By the assumption that T has no infinite path, U covers the unit interval. Therefore,
by (3), there is a finite initial segment U′ of the enumeration that covers the interval.
By construction, C is disjoint from ⋃_i I_i, so C must be covered by the finite
collection of intervals of the form J_τ in U′. Thus the height of T is bounded by the
length of the largest τ for which J_τ is in U′, so T is finite. □


10.5.1 Separably closed sets 


An open set U in a complete separable metric space is coded as a sequence of open 
balls. A closed set is coded with the sequence of open balls in its complement. Closed 
sets are thus coded with “negative information”: information on which points are not
in the closed set. We may want to convert this to “positive information” about which 
points are in the set. 

Because most closed sets are not the countable union of open balls or the countable 
union of closed balls, we require a different way to represent this positive information. 
One possibility is to enumerate a dense subset S of the closed set. Thus a point y 
is in the closed set if and only if there are points in S arbitrarily close to y, which is an
arithmetical property of S and y. 


Definition 10.5.6 (Separably closed sets). A set C ⊆ X is separably closed if there
is a sequence (x_n) of points in C such that for every y ∈ C and every r ∈ Q⁺ there
is an i with d(x_i, y) < r. We treat the sequence (x_n) as a code for the set C.
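
The positive information carried by such a code can be illustrated with a small search. The Python sketch below is an assumption-laden example and not part of the text: the closed set is taken to be C = [1/3, 2/3], the dense sequence is one particular enumeration of rationals in C, and `close_index` is a hypothetical helper with an artificial search limit.

```python
from fractions import Fraction
from itertools import count, islice

def dense_sequence():
    """Enumerate a dense sequence of rationals in C = [1/3, 2/3] (illustrative choice)."""
    for m in count():
        for k in range(2**m + 1):
            yield Fraction(1, 3) + Fraction(k, 3 * 2**m)

def close_index(y, r, limit=10_000):
    """Search the first `limit` terms for an index i with |x_i - y| < r."""
    for i, x in enumerate(islice(dense_sequence(), limit)):
        if abs(x - y) < r:
            return i
    return None

hit = close_index(Fraction(1, 2), Fraction(1, 1000))
miss = close_index(Fraction(0), Fraction(1, 10), limit=500)
```

A successful search confirms that y is within r of C. An unsuccessful bounded search proves nothing by itself, which is one way to see why passing from the positive code to the negative code is not a computable operation in general.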


With two representations of closed sets, a natural question is the difficulty of 
converting between the representations. The next theorem shows that converting 
between positive and negative representations of closed sets is nontrivial even in the 
particular case of [0, 1].


Theorem 10.5.7 (Brown [22]; see [157, 11]). The following are equivalent over
RCA₀.


1. ACA₀.

2. In a compact metric space, every nonempty closed set is separably closed.
3. In [0, 1], every nonempty closed set is separably closed.

4. In a complete separable metric space, every separably closed set is closed.
5. In [0, 1], every separably closed set is closed.


Proof. Exercises 10.9.11 and 10.9.12 ask for proofs of (2) and (4) in ACA₀. Part (3)
follows immediately from (2) in RCA₀, and (5) follows from (4). We will prove that
each of (3) and (5) implies ACA₀ over RCA₀.

First, we work in RCA₀ and assume every nonempty closed set in [0, 1] is separably
closed. In light of Theorem 10.2.4, it is enough to prove that every bounded,
increasing sequence of rationals in [0, 1] converges. Let ⟨a_i : i ∈ N⟩ be a monotone
increasing sequence of rationals in [0, 1]. We may assume that there is no rational
number that is the limit of (a_i).

For each i, let U_i be the interval [0, a_i), which is open in [0, 1]. Let C be the
closed set with complement U = ⋃_i U_i. We have 1 ∈ C, so C is nonempty. Under our
assumption, there is a sequence S = (s_i) so that the closure of S is C.

Now, for each rational q ∈ [0, 1], either q < a_i for some i, or q > s_i for some i.
Otherwise, because q ≥ a_i for all i, then q ∈ C by construction, but q ≤ s_i for all i,
so q = min C, so lim a_n = q, contradicting the assumption that (a_n) does not converge
to a rational number. This means that, for a rational number q, the properties q ∈ U
and q ∈ C are Σ⁰₁, and every rational in [0, 1] is in U ∪ C. Working in RCA₀, we may
form the set L of all rationals q in [0, 1] such that q < a_i for some i, that is, the set
of rationals in U.

We now apply an interval-halving technique that is also seen in the proof of the
intermediate value theorem in RCA₀ (Theorem 10.4.1). We will produce sequences
(c_n) and (d_n) of rationals with c_n ≤ c_{n+1} < d_{n+1} ≤ d_n and |c_n − d_n| ≤ 2^{-n}, so that
each c_n is in U and each d_n is in C. At stage 0, let c_0 = 0 and d_0 = 1. At stage k + 1,
assume we have constructed rationals c_k and d_k. Let z = (c_k + d_k)/2, which will be
rational. If z ∈ L we let c_{k+1} = z and d_{k+1} = d_k. Otherwise we let c_{k+1} = c_k and
d_{k+1} = z. In either case, c_{k+1} and d_{k+1} have the desired properties. By Σ⁰₁ induction,
the sequences (c_n) and (d_n) exist.

By construction, (c_n) and (d_n) are both quickly converging Cauchy sequences
with the same limit x. It is impossible for x ∈ U, because then x < a_i for some i,
and hence lim d_n ≠ x. It is also impossible to have s_i < x for any i, because then
lim c_n ≠ x. Hence x = min C, and lim a_i = x as desired.

For the second reversal, we work in RCA₀ and assume that every separably closed
set is closed. To establish ACA₀, it is enough to show that the range of an arbitrary
injection f: N → N exists. Let S enumerate {2^{-f(m)} : m ∈ N} ∪ {0}. The set of
points enumerated in S is closed, but in any case we may view S as a dense subset
of a closed set C. By assumption, there is a code for the open complement U of C.
Now we have that an arbitrary y is in the range of f if and only if 2^{-y} ∈ S, and also if
and only if there is no ball B(a, r) in U with 2^{-y} ∈ B(a, r). We may thus form the
range of f via Δ⁰₁ comprehension. □


In Theorem 10.5.7 (2), there is an additional assumption that the space is compact.
For non-compact spaces, the next theorem shows that Π¹₁-CA₀ is needed to find a
dense subset of an arbitrary closed set. The fundamental idea in the proof is that for
each Σ¹₁ formula φ(n) there is a sequence of trees T_n in ω^ω such that, for each n,
φ(n) holds if and only if there is a path in T_n. At the same time, the set of paths
through a tree in ω^ω is a closed set in that space. Hence locating points in closed
sets of ω^ω can allow us to answer Σ¹₁ questions.


Theorem 10.5.8. The following are equivalent over RCA₀.


1. Π¹₁-CA₀.

2. In a complete separable metric space, each nonempty closed set is separably
closed.

3. In N^N, every nonempty closed set is separably closed.


Proof. We first work in Π¹₁-CA₀ and let A be a complete separable metric space. Let
C be a nonempty closed set in A. We may use Π¹₁-CA₀ to form the set S of pairs ⟨a, r⟩
such that C ∩ B(a, r) is nonempty. We will show in Corollary 12.1.14 that ATR₀, and
thus Π¹₁-CA₀, proves the Σ¹₁ axiom of choice, which is the following scheme over Σ¹₁
formulas ψ that may have parameters:


(∀n)(∃X) ψ(n, X) → (∃Y)(∀n) ψ(n, Y_n).

We have shown the following formula holds, in which the matrix is arithmetical:

(∀⟨a, r⟩)(∃x)[⟨a, r⟩ ∈ S → x ∈ C ∧ x ∈ B(a, r)].


Thus, using Σ¹₁ choice, we may form a sequence (y_n) of points such that, whenever
n = ⟨a, r⟩ and C ∩ B(a, r) ≠ ∅, we have y_n ∈ C ∩ B(a, r). It is immediate that we
can form a dense sequence in C from the sequence (y_n).

The implication from (2) to (3) is immediate. To prove the reversal that (3)
implies Π¹₁ comprehension, we first show in RCA₀ that (3) implies ACA₀ (see
Exercise 10.9.15). Thus, working in ACA₀, it is enough to show that (3) implies Σ¹₁
comprehension.

Let φ(n) be a Σ¹₁ formula with parameters. We want to form the set {n : φ(n)}.
We first apply Kleene’s normal form theorem (Theorem 5.8.2) to find a formula θ
such that

(∀n)[φ(n) ↔ (∃f ∈ N^N)(∀m) θ(n, m, f[m])].


We may form a sequence of trees T_n such that the set of paths through T_n is exactly
the set of f such that θ(n, m, f[m]) holds for all m. We may then combine the trees
(T_n) to form a single tree T so that a sequence σ is in T if and only if σ is of the
form ⟨i⟩⌢τ where τ ∈ T_i.

The set of paths through T is a closed set. As usual, the corresponding open set
is coded by the set of basic open balls corresponding to all elements of N^{<N} not
in T. Apply (3) to find a sequence (g_k) dense in [T]. Then, for each n, there is some
f ∈ [T_n] if and only if there is a g_k with g_k(0) = n, and this happens if and only
if φ(n). This allows us to form the set {n : φ(n)} with arithmetical comprehension
relative to the sequence (g_k). □


10.5.2 Uniform continuity and boundedness 


In some foundational programs, particularly in constructive analysis, an assumption 
is made that each continuous function on the real line is accompanied by a modulus of 
continuity on each closed interval. The lack of that assumption in reverse mathematics 
is a key source of separation between the programs. For example, there are theorems 
that are viewed as constructive in Bishop’s program, when expressed in the style of
that program, but which are not provable when expressed in RCA₀ in the style of
reverse mathematics. In this section, we survey results on the reverse mathematics 
of uniform continuity. 


Definition 10.5.9 (Modulus of uniform continuity).

Let f be a function from a complete separable metric space A to a complete
separable metric space B. A modulus of uniform continuity for f is a function
h: Q⁺ → Q⁺ such that, for all x, y ∈ A and all r ∈ Q⁺, if d(x, y) < h(r) then
d(f(x), f(y)) < r.


If we let D = {2^{-n} : n ∈ N}, there is no loss of generality in assuming h is a
function from D to D instead of Q⁺ to Q⁺.
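
As a concrete instance of the definition, consider f(x) = x² on [0, 1]. Since |x² − y²| = |x − y| · |x + y| ≤ 2|x − y| there, the function h(r) = r/2 is a modulus of uniform continuity. The Python sketch below merely spot-checks this on a rational grid; the grid, the list of radii, and the variable names are choices made for this illustration.

```python
from fractions import Fraction

def f(x):
    return x * x

def h(r):
    """Candidate modulus for f on [0, 1]: |x^2 - y^2| <= 2 |x - y| on the unit interval."""
    return r / 2

grid = [Fraction(k, 64) for k in range(65)]
radii = [Fraction(1, 2**n) for n in range(1, 6)]

# Collect any triple (x, y, r) with d(x, y) < h(r) but d(f(x), f(y)) >= r.
violations = [(x, y, r) for r in radii for x in grid for y in grid
              if abs(x - y) < h(r) and not abs(f(x) - f(y)) < r]
```

Because h maps each 2^-n to 2^-(n+1), it also restricts to a function from D to D, in line with the remark above.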

While the moduli of continuity for polynomial functions or other common func- 
tions can be directly constructed, WKLo is needed in general to obtain a modulus of 
uniform continuity for a continuous function on a compact space. Exercise 10.9.17 
asks for a proof of the following. 


Lemma 10.5.10 (Brown; see Simpson [288, IV.2.2]). The following is provable in 
WKL₀. Suppose A is a compact metric space, C ⊆ A is closed, and f is a continuous
function from C to a complete separable metric space B. Then f has a modulus of 
uniform continuity on C. 


The proof of the next theorem demonstrates a somewhat peculiar method: when 
WKLo fails, there is an infinite tree with no infinite path. Trees with this property are 
difficult to visualize, but can be useful to construct counterexamples. 


Theorem 10.5.11 (Simpson [288, IV.2.3]). The following are equivalent over 
RCAo. 


1. WKLo. 

2. If F is a coded continuous function from an interval [a, b] to R then F achieves 
a maximum value at some point in the interval. 

3. If F is a coded continuous function from an interval [a, b] to R then F is
bounded.


Proof. The implication from (1) to (3) follows in WKL₀ from Lemma 10.5.10, because a
modulus of continuity immediately provides a bound on the function. The implication
from (1) to (2) is straightforward. Since (2) implies (3), it remains to show that (3)
implies WKL₀. We work in RCA₀ and assume WKL₀ fails, which means there is an
infinite tree T with no path. We want to show that (3) fails.

Apply the construction from Lemma 3.8.2 to T, letting C be the Cantor set and
letting (J_τ) be the corresponding sequence of intervals. Define a set T̃ ⊆ 2^{<N} by
putting τ ∈ T̃ if and only if τ ∉ T but every proper prefix of τ is in T. Thus the intervals in
{J_τ : τ ∈ T̃} are pairwise disjoint and cover C. Note that, in the lexicographic order,
each element τ ∈ T̃ has an immediate successor and immediate predecessor within
T̃, which can be found effectively from T.

For each x ∈ [0, 1], one of two cases holds. Case 1: x is in the closure of an
interval J_τ for a unique τ ∈ T̃. Case 2: x is strictly between two intervals J_σ and
J_τ, with σ, τ ∈ T̃ so that τ is the immediate successor of σ within T̃.

We may thus define a continuous function f: [0, 1] → R⁺ as follows. In case 1,
for x ∈ J_σ, let f(x) = |σ|. For case 2, for x between J_σ and J_τ, interpolate linearly
between the values on J_σ and J_τ. Because T is infinite, this function is unbounded. □


We leave the proof of the following to Exercise 10.9.17. 


Theorem 10.5.12 (Simpson [288, IV.2.3]). The following are equivalent over 
RCAo. 


1. WKLo. 

2. Every continuous function on a compact complete separable metric space has a 
modulus of uniform continuity. 

3. Every continuous function on the unit interval has a modulus of uniform
continuity.


10.6 Topological dynamics and ergodic theory 


In this section, we consider several results from topological dynamics. These results 
move us beyond the elementary analysis of the previous sections. They are also of 
interest for demonstrating a principle that is strictly between WKLo and ACAo, and 
highlighting the open question of the strength of the iterated version of Hindman’s 
Theorem. 

We then move to measure theory and ergodic theory. Some basic aspects of 
measure theory are closely related to WWKLo (Section 9.11) and to algorithmic 
randomness. We also state a result of Avigad and Simic on the mean ergodic theorem. 

In topological dynamics, a compact dynamical system consists of a compact
metric space X and a continuous function T: X → X. The orbit of a point z is the
set {T^k(z) : k ≥ 0}. In very simple systems, the orbit of a point z may be periodic,
with T^n(z) = z for some n > 0. More complex systems will not have periodic orbits, but
they will have orbits with other recurrence properties, as in the following definition.


Definition 10.6.1 (Almost periodic and proximal points).


1. A point x of a compact dynamical system is recurrent if there is a sequence (n_i)
such that lim_i T^{n_i}(x) = x.

2. A point x is almost periodic (also uniformly recurrent) if, for every r ∈ Q⁺ there
is an N_r such that, for every m, there is a k < N_r with T^{m+k}(x) ∈ B_r(x).

3. Two points x and y are proximal if, for every r ∈ Q⁺, there are infinitely many n
with d(T^n x, T^n y) < r.
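
These notions can be explored numerically in a simple system. The Python sketch below uses the doubling map on the circle R/Z, restricted to rational points so that orbits are computed exactly. The system, the helper names, and the finite horizons are all assumptions made for this illustration; a finite check can only give evidence for properties that quantify over all m or over infinitely many n.

```python
from fractions import Fraction

def T(x):
    """Doubling map on the circle R/Z, represented by rationals in [0, 1)."""
    return (2 * x) % 1

def dist(x, y):
    """Circle distance between two points of [0, 1)."""
    d = abs(x - y)
    return min(d, 1 - d)

def orbit(x, length):
    """Return [x, T(x), ..., T^(length-1)(x)]."""
    points = []
    for _ in range(length):
        points.append(x)
        x = T(x)
    return points

def returns_every(x, r, N, horizon):
    """Check, for all m < horizon, that some k < N has d(T^(m+k)(x), x) < r."""
    pts = orbit(x, horizon + N)
    return all(any(dist(pts[m + k], x) < r for k in range(N)) for m in range(horizon))

def close_times(x, y, r, horizon):
    """List the n < horizon with d(T^n x, T^n y) < r."""
    return [n for n, (p, q) in enumerate(zip(orbit(x, horizon), orbit(y, horizon)))
            if dist(p, q) < r]
```

For example, the point 1/7 has period 3 under this map, so the bound N_r = 3 works for every r, while the points 1/4 and 0 coincide from the second iterate onward and are therefore proximal.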


Birkhoff’s recurrence theorem shows that every compact dynamical system has a
recurrent point. A stronger theorem shows that every such system has an almost 
periodic point. We begin this section with an examination of these two results, based 
on work of Day. 

We then study the Auslander–Ellis theorem, which states that every point in a
compact dynamical system is proximal to an almost periodic point. This theorem 
is closely related to an iterated version of Hindman’s theorem and to several other 
combinatorial results. In the second subsection, we discuss the known equivalences 
and state several key open problems. 


10.6.1 Birkhoff’s recurrence theorem 


In this section, we focus on dynamical systems over closed subsets of Cantor space.
For such spaces, we can use a particularly concrete representation of the continuous
transformation of the dynamical system. Exercise 10.9.18 shows that, in WKL₀, every
coded continuous function on Cantor space is encoded as in the next definition.


Definition 10.6.2 (Coded transformations on 2^N, Day [63]). A function f: 2^{<N} →
2^{<N} encodes a transformation of 2^N if f is total and order preserving, and for every
l there is an m such that |f(σ)| ≥ l for all σ ∈ {0, 1}^m.
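
A standard example of such an encoding is the shift map, which deletes the first bit of a sequence. The Python sketch below represents finite binary strings as tuples and spot-checks the two requirements of the definition up to a fixed length. The representation and the helper names are assumptions made for this illustration.

```python
from itertools import product

def f(sigma):
    """Shift code: drop the first bit of a finite binary string (a tuple of 0s and 1s)."""
    return sigma[1:]

def is_prefix(sigma, tau):
    """True when sigma is an initial segment of tau."""
    return tau[:len(sigma)] == sigma

def strings(m):
    """All binary strings of length exactly m."""
    return list(product((0, 1), repeat=m))

def order_preserving_up_to(n):
    """Check that sigma <= tau implies f(sigma) <= f(tau) for strings of length <= n."""
    all_strings = [s for m in range(n + 1) for s in strings(m)]
    return all(is_prefix(f(s), f(t)) for s in all_strings for t in all_strings
               if is_prefix(s, t))

def length_witness(l):
    """An m such that |f(sigma)| >= l for every sigma of length m."""
    return l + 1
```

The transformation F of 2^N encoded by f sends an infinite sequence x to the union of the strings f(x↾m), which here is x with its first bit removed.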


Definition 10.6.3 (Compact dynamical systems on Cantor space, Day [63]). A 
compact dynamical system over the Cantor space consists of: 


1. a tree C ⊆ 2^{<N} coding a nonempty closed subset [C] of 2^N, and
2. a transformation F: 2^N → 2^N encoded by a map f: 2^{<N} → 2^{<N}.


We require that f(σ) ∈ C for all σ ∈ C. This guarantees F([C]) ⊆ [C].
The next results relate to a form of Birkhoff’s recurrence theorem. 


Lemma 10.6.4 (Day [63]). WKLo proves that every compact dynamical system over 
the Cantor space has a recurrent point. 


Proof. Working in WKL₀, let (C, f) be a compact dynamical system over the Cantor
space. It follows from definitions that a point x in the system is recurrent if, for
every i, there are n, l > i such that x↾i ⪯ f^n(x↾l). Thus the set of recurrent points
R is Π⁰₂ relative to C and f. Our goal is to construct an infinite tree T so that [T] ⊆ R.

To do so, we will inductively construct a sequence (U_i) of finite sets of strings,
viewing each U_i as a closed set [U_i] ⊆ 2^N. The construction will ensure that if
x ∈ [U_i] ∩ [C] then x↾i ⪯ f^n(x↾l) for some n, l > i. Hence every x ∈ ⋂_i [U_i] ∩ [C]
will be a recurrent point.

We also need to ensure that [U_i] ∩ [C] is nonempty for each i. To do so, the
construction will make sure that for each i there is an s = s_i such that [C] ⊆
⋃_{n≤s} f^{-n}([U_i]). Assuming this condition, if [U_i] ∩ [C] = ∅ then there would be
some x ∈ [C] and n ≤ s with f^n(x) ∈ [U_i], a contradiction to the assumption that
F([C]) ⊆ [C].

The construction will form a sequence (U_i) of finite sets of strings, a sequence
(s_i) of numbers, and an auxiliary sequence (V_i) of finite sets of strings. We will
ensure the following properties hold for each i.


1. For every τ ∈ U_i, there is an n with i ≤ n ≤ s_i such that τ↾i ⪯ f^n(τ).

2. For every σ ∈ V_i there are n ≤ s_i and τ ∈ U_i such that τ ⪯ f^n(σ).

3. V_i codes a finite open cover of [C]. This is a Σ⁰₁ property: there is a level l such
that every τ ∈ C of length l extends some element of V_i.


The three conditions together can be written as a Σ⁰₁ formula, allowing us to use only
Σ⁰₁ induction in the construction.

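At stage 0, let $U_0 = V_0 = \{\langle\rangle\}$ and $s_0 = 0$. The three conditions are straightforward
to verify. At stage $i + 1$, we define infinite sequences


\begin{align*}
U_{i+1}(s) &= \{\sigma \in 2^{\leq s} : (\exists \tau \in U_i)[\sigma \succeq \tau] \land (\exists n)[(i < n < s) \land (\sigma \upharpoonright i \preceq f^n(\sigma))]\}, \\
V_{i+1}(s) &= \{\sigma \in 2^{\leq s} : (\exists m < s)(\exists \tau \in U_{i+1}(s))[f^m(\sigma) \succeq \tau]\}.
\end{align*}


We immediately have $[U_{i+1}(s)] \subseteq [U_i]$, $[U_{i+1}(s)] \subseteq [U_{i+1}(s + 1)]$, and
$[V_{i+1}(s)] \subseteq [V_{i+1}(s + 1)]$. It can be shown in $\mathsf{WKL}_0$ that $[C] \subseteq \bigcup_s [V_{i+1}(s)]$
(Exercise 10.9.19). Hence, by Theorem 10.5.5, there is some $s_{i+1}$ such that every sequence
of length $s_{i+1}$ in $C$ extends some sequence in $V_{i+1}(s_{i+1})$. We define $U_{i+1} = U_{i+1}(s_{i+1})$
and $V_{i+1} = V_{i+1}(s_{i+1})$. We may verify the three properties hold.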

It follows from the construction that $[U_i] \cap [C]$ is nonempty for each $i$. Hence
$\langle [U_i] \cap [C] : i \in \mathbb{N} \rangle$ is a nested sequence of nonempty closed sets of $2^{\omega}$. Hence,
by Exercise 10.9.6, there is a point in $\bigcap_i [U_i] \cap [C]$, which is the desired recurrent
point of $(C, f)$. □


Theorem 10.6.5 (Day [63]). The following are equivalent over $\mathsf{RCA}_0$.


1. $\mathsf{WKL}_0$.
2. Every compact dynamical system over the Cantor space has a recurrent point.


Proof. In light of the previous lemma, we need to prove that (2) implies $\mathsf{WKL}_0$.
Working in $\mathsf{RCA}_0$, let $T$ be an infinite subtree of $2^{<\omega}$. We view $T$ as a subset of $3^{<\omega}$
and construct a compact dynamical system over $3^{\omega}$ (modifying the definitions in the
obvious way). We will define a dynamical system $D$ on $3^{\omega}$ so that every recurrent
point is a path through $T$. Because there is an effective bijection from $2^{<\omega}$ to $3^{<\omega}$ that
gives an effective homeomorphism from $2^{\omega}$ to $3^{\omega}$, we can then apply (2) to find the
desired recurrent point in $D$.
As we construct the map $f\colon 3^{<\omega} \to 3^{<\omega}$, our goal is that if $x \in [T]$ then $f(x) = x$,
and otherwise $x$ is not recurrent. The additional branching in $3^{<\omega}$ gives us space
to move each point $x \in 3^{\omega}$ that is not a path in $T$. The general method is that the
orbit of $x$ moves in increasing lexicographical order, trying to find a path in $[T]$, and
wraps around when it extends $\langle 2 \rangle$.

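We let $f(\langle\rangle) = \langle\rangle$. Given $\sigma \neq \langle\rangle$, let $n = |\sigma| - 1$, so $\sigma(n)$ is the final value of $\sigma$.
If $\sigma \in T$ then let $f(\sigma) = \sigma \upharpoonright n$. Otherwise, if $\sigma \notin T$, let $\pi$ be the shortest initial
segment of $\sigma$ such that $\pi \notin T$. Hence, if $\pi$ contains a 2 then this 2 is the final value
of $\pi$. Using that fact, we can define


\[
f(\sigma) = \begin{cases}
(\rho^\frown\langle 1 \rangle^\frown 0^{\omega}) \upharpoonright n & \text{if } \pi = \rho^\frown\langle 0 \rangle \text{ or } \pi = \rho^\frown\langle 0, 2 \rangle, \\
(\rho^\frown\langle 2 \rangle^\frown 0^{\omega}) \upharpoonright n & \text{if } \pi = \rho^\frown\langle 1 \rangle \text{ or } \pi = \rho^\frown\langle 1, 2 \rangle, \\
0^{n} & \text{if } \pi = \langle 2 \rangle.
\end{cases}
\]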


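Then $D = (3^{\omega}, f)$ is a compact dynamical system. For the rest of this proof, let $\leq_{\mathrm{lex}}$
be the lexicographical order on $3^{<\omega}$. Each of the following claims is straightforward
to verify from the construction.


1. Let $n > 0$ and let $\sigma_0, \ldots, \sigma_n$ be a sequence such that $\sigma_n \preceq \sigma_0$, $f(\sigma_i) = \sigma_{i+1}$ for all
$i < n$, and $\sigma_n \notin T$. Then $\langle 2 \rangle \preceq \sigma_k$ for some $k$.
2. If $\tau \in T$, $|\sigma| < |\tau|$, and $\sigma \leq_{\mathrm{lex}} \tau$ then $f(\sigma) \leq_{\mathrm{lex}} \tau$.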
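To experiment with the construction, the case definition of $f$ can be transcribed into Python. This is a sketch of our reading of the three cases, not code from Day [63]: ternary strings are tuples over {0, 1, 2}, the tree $T$ is supplied as a hypothetical membership test `in_T` that accepts the empty string and rejects every string containing a 2, and `make_f` is a name chosen here.

```python
def make_f(in_T):
    """Build the map f on ternary strings from a membership test for T."""
    def f(sigma):
        if not sigma:
            return ()
        n = len(sigma) - 1
        if in_T(sigma):
            return sigma[:n]                 # inside T: just drop the last entry
        # pi is the shortest initial segment of sigma that is not in T
        pi = next(sigma[:j] for j in range(1, len(sigma) + 1)
                  if not in_T(sigma[:j]))
        if pi == (2,):
            return (0,) * n                  # wrap around to the all-zero string
        # Here pi = rho^<d> or pi = rho^<d, 2> with d in {0, 1}
        if pi[-1] == 2:
            rho, d = pi[:-2], pi[-2]
        else:
            rho, d = pi[:-1], pi[-1]
        # Replace d by d + 1, pad with zeros, and cut back to length n
        return (rho + (d + 1,) + (0,) * n)[:n]
    return f

# Example tree: all finite strings of 1s, whose only path is 1^omega.
only_ones = lambda s: all(b == 1 for b in s)
f = make_f(only_ones)
```

Starting from `(0, 0, 0, 0)`, repeated application of `f` gives `(1, 0, 0)`, then `(1, 1)`, then `(1,)`: the orbit moves upward lexicographically until it lands inside the tree, after which `f` only shortens the string.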


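Let $x$ be a recurrent point for $D$. Working towards a contradiction, assume $x \notin [T]$,
and choose $\sigma \prec x$ with $\sigma \notin T$. Because $x$ is recurrent, we may choose a sequence
$\sigma_0, \ldots, \sigma_n$ with $\sigma_n = \sigma \preceq \sigma_0 \preceq x$ and $f(\sigma_i) = \sigma_{i+1}$. By the first claim, we have
$\langle 2 \rangle \preceq \sigma_k$ for some $k < n$.

Now choose $\tau \in T$ such that $|\tau| > |\sigma_i|$ for all $i \leq n$. Because $\sigma_{k+1}$ is a string
of all 0s, and $|\sigma_{k+1}| < |\tau|$, we have $\sigma_{k+1} \leq_{\mathrm{lex}} \tau$. Then, by induction on the second
claim, we have $\sigma_n \leq_{\mathrm{lex}} \tau$. Now $\sigma_n \neq \tau$ because $\sigma_n \notin T$, and $\sigma_n \preceq \sigma_0$, so $\sigma_0 <_{\mathrm{lex}} \tau$.

By another induction over the second claim, we have $\sigma_i \leq_{\mathrm{lex}} \tau$ for all $i$, and hence
$\langle 2 \rangle \not\preceq \sigma_i$ for all $i$, contradicting $\langle 2 \rangle \preceq \sigma_k$. Thus $x \in [T]$, as desired. □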


We now turn to the principle AP which states that every compact topological 
system over Cantor space has an almost periodic point. We state the following 
results without proof. Here, a system $(C, f)$ is minimal if, for every system $(D, f)$
with $[D] \subseteq [C]$, we have $[D] = [C]$.


Lemma 10.6.6 (Day [63]). $\mathsf{WKL}_0$ proves that every point in a minimal system is
almost periodic.


Lemma 10.6.7 (Day [63]). Over $\mathsf{WKL}_0$, $\mathsf{ACA}_0$ is equivalent to the proposition that
every compact dynamical system on Cantor space contains a minimal subsystem.


The two lemmas combine to give the following bound on the strength of AP. 


Corollary 10.6.8 (Day [63]). The principle AP is provable in $\mathsf{ACA}_0$.


Building on this, Day showed that AP is strictly between $\mathsf{WKL}_0$ and $\mathsf{ACA}_0$. This
result requires two theorems, one to show AP is stronger than $\mathsf{WKL}_0$ and one to show
$\mathsf{ACA}_0$ is stronger than AP. Each of the proofs constructs an $\omega$-model to give the
separation. We prove one of the two separations here. In the following results, we
use a fixed effective enumeration of all $\Pi^0_1$ classes. We state the next lemma without
proof.


Lemma 10.6.9 (Day [63]). Let $f$ be the left shift on Cantor space and let $P \subseteq 2^{\omega}$
be a $\Pi^0_1$ class. There is a $\Pi^0_1$ class $C$, whose index is uniformly computable from an
index for $P$, such that $(C, f)$ is a compact dynamical system on Cantor space and
one of the following holds.


1. $C \cap P = \emptyset$.
2. There is a nonempty $\Pi^0_1$ class $\hat{P} \subseteq P$ with the property that no element of $\hat{P}$ is
an almost periodic point of $(C, f)$.


Theorem 10.6.10 (Day [63]). There is an $\omega$-model of $\mathsf{WKL}_0 + \neg\mathsf{AP}$.


Proof. Let $\langle Q_i \rangle$ be an effective enumeration of all $\Pi^0_1$ classes on Cantor space and
let $f$ be the left shift map. Let $\pi_e$ be the projection from a point in Cantor space onto
its $e$th coordinate. Using Lemma 10.6.9, and applying pullbacks as needed, we can
build a single system


\[
(C, g) = \prod_{e \in \omega} (C_e, f)
\]


so that for each $e$, either $\pi_e(Q_e) \cap C_e$ is empty or there is a nonempty $\Pi^0_1$ class $\hat{Q}_e$
so that no element of $\pi_e(\hat{Q}_e)$ is almost periodic.

We next form a nested sequence $\langle P_s \rangle$ of $\Pi^0_1$ classes. Let $\langle \Phi_s \rangle$ be an effective
enumeration of all Turing functionals. At stage 0, let $P_0$ be a nonempty $\Pi^0_1$ class all
of whose members have PA degree. Now, at stage $s + 1$, consider $\Phi_s$. If there is an $n$
such that $\{x \in P_s : \Phi_s^x(n) \uparrow\}$ is nonempty, let $P_{s+1}$ be this set for the least such $n$.

If $\Phi_s$ is total on all elements of $P_s$, consider the $\Pi^0_1$ class $Q = \Phi_s(P_s)$. Let $e$ be an
index for $Q$. There are two cases. In the first case, if $\pi_e(Q) \cap C_e = \emptyset$, let $P_{s+1} = P_s$.
In this case, it is clear that no element of $P_{s+1}$ computes an element of $C$. In the
second case, there is a nonempty $\Pi^0_1$ set $\hat{Q} \subseteq Q$ such that no element of $\pi_e(\hat{Q})$ is
almost periodic in $C_e$, and thus no element of $\hat{Q}$ is almost periodic in $C$. In this case,
let $P_{s+1} = \{x \in P_s : \Phi_s^x \in \hat{Q}\}$. Again, $P_{s+1}$ will be a nonempty $\Pi^0_1$ class and no
element of $P_{s+1}$ computes an almost periodic point in $(C, g)$ under $\Phi_s$.

Because $\langle P_s \rangle$ is a nested sequence of nonempty closed sets in Cantor space, by
compactness there is a point $x \in \bigcap_s P_s$. In particular $x \in P_0$, so $x$ has PA degree, and
by construction $x$ cannot compute an almost periodic point of $(C, g)$. The $\omega$-model
of sets computable from $x$ will thus satisfy $\mathsf{WKL}_0$ and not AP. □


We omit the proofs of the following results, which separate AP from $\mathsf{ACA}_0$ and
also provide a conservation result related to AP.


Theorem 10.6.11 (Day [63]). There is an $\omega$-model of AP consisting entirely of low
sets. In particular, this model satisfies $\mathsf{AP} + \neg\mathsf{ACA}_0$.


Corollary 10.6.12 (Day [63]). Over $\mathsf{RCA}_0$, the principle AP is strictly between $\mathsf{ACA}_0$
and $\mathsf{WKL}_0$.


Theorem 10.6.13 (Day [63]). AP is conservative over $\mathsf{WKL}_0$ for $\Pi^1_1$ sentences.


10.6.2 The Auslander–Ellis theorem and iterated Hindman’s theorem


We turn now to the Auslander–Ellis theorem. Because this theorem is known to
imply $\mathsf{ACA}_0$, we can often work over $\mathsf{ACA}_0$ when proving equivalences, making the
coding systems much easier to manage.

We begin by summarizing the equivalences between two iterated versions of
Hindman’s theorem, a principle on ultrafilters, and a version of the Milliken–Taylor
theorem. Full definitions, and proofs yielding the following theorem, are given by
Hirst [155], who studied the possibilities of formalizing proofs of Hindman’s theorem
via ultrafilters into second order arithmetic.


Theorem 10.6.14. The following are equivalent over $\mathsf{RCA}_0$.


1. IHT (= $\mathsf{IHT}_2$): For every sequence of 2-colorings $\langle C_i : i \in \mathbb{N} \rangle$ there is an
increasing sequence $\langle x_i : i \in \mathbb{N} \rangle$ of numbers such that for every $j \in \mathbb{N}$ the set
$\{x_i : i > j\}$ satisfies Hindman’s Theorem for $C_j$ (see [155, Theorem 3]).

2. $\mathsf{IHT}_{<\omega}$: For every sequence of finite colorings $\langle C_i : i \in \mathbb{N} \rangle$ there is an increasing
sequence $\langle x_i \in \mathbb{N} : i \in \mathbb{N} \rangle$ such that for every $j \in \mathbb{N}$ the set $\{x_i : i > j\}$ satisfies
Hindman’s Theorem for $C_j$ (see [155, Lemma 4]).

3. AUF: Every countable downward translation algebra has an almost downward
translation invariant ultrafilter (see [155, p. 2]).

4. For any $n \geq 3$, the statement $\mathsf{MT}_n$: If $f\colon [\mathbb{N}]^n \to k$ then there is an increasing
sequence $X = \langle x_i \in \mathbb{N} : i \in \mathbb{N} \rangle$ such that $f$ is constant on $\mathrm{FS}_n(X)$ (see [155,
Lemma 5]).
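The tail condition in IHT can be made concrete on finite data. The sketch below is only a finite analogue under an assumption: we read "satisfies Hindman's Theorem for $C_j$" as "$C_j$ is constant on all finite sums of distinct elements of the set", and the function names are hypothetical.

```python
from itertools import combinations

def finite_sums(xs):
    """All sums of nonempty sets of distinct elements of xs."""
    return {sum(c) for r in range(1, len(xs) + 1) for c in combinations(xs, r)}

def monochromatic_on_sums(coloring, xs):
    """True if the coloring is constant on the finite sums of xs."""
    return len({coloring(s) for s in finite_sums(xs)}) == 1

def tail_condition(colorings, xs):
    """Finite analogue of IHT: for each j, {x_i : i > j} is homogeneous for C_j."""
    return all(monochromatic_on_sums(c, xs[j + 1:])
               for j, c in enumerate(colorings) if xs[j + 1:])

parity = lambda n: n % 2
second_bit = lambda n: (n // 2) % 2
```

With `xs = [1, 2, 4, 8]`, the tail `[2, 4, 8]` has only even finite sums, so it is homogeneous for `parity`, and the tail `[4, 8]` has finite sums 4, 8, and 12, all of which have second binary digit 0. The real principle concerns infinite sequences, which no finite check can certify.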


Blass, Hirst, and Simpson [17, Theorem 4.13] show that IHT is provable in $\mathsf{ACA}_0^+$,
and hence so are the equivalent principles above.

The Auslander–Ellis theorem is the following statement: Let $X$ be a compact
metric space and $T\colon X \to X$ a continuous function. For every $x \in X$ there is a $y \in X$
such that $y$ is almost periodic under $T$ and $x$ is proximal to $y$ under $T$ (see [17, pp.
147–148] and [120, Theorem 8.7]).

This theorem is known to be closely related to IHT. Blass, Hirst, and Simpson [17,
Theorem 5.11] use IHT as a lemma to prove the Auslander–Ellis theorem in $\mathsf{ACA}_0^+$.
Kreuzer [188, Theorem 11] notes that IHT follows from $\mathsf{ACA}_0$ and the Auslander–Ellis
theorem using ultrafilter techniques.

We will sketch a slightly different proof in the next theorem, using well-known 
results described by Furstenberg [120], and assuming the reader has access to that 
book. In particular, see [120, Theorems 8.8, 8.10, and 8.11; Remark, p. 163; and 
p. 127].


Theorem 10.6.15. The following are equivalent over $\mathsf{ACA}_0$.


1. IHT.
2. The Auslander–Ellis theorem.


Proof. The implication from (1) to (2) in $\mathsf{ACA}_0$ is given by Blass, Hirst, and Simpson
[17, Theorem 5.11]. We work in $\mathsf{ACA}_0$ and prove the converse.

Recall that $\mathsf{ACA}_0$ (actually $\mathsf{WKL}_0$) proves that every continuous function on a
compact metric space has a modulus of uniform continuity. Thus for every compact
metric space $X$, continuous map $T\colon X \to X$, $\varepsilon > 0$, and $p \in \mathbb{N}$ we can find (uniformly
in $\varepsilon$ and $p$) a $\delta > 0$ such that $d(x, y) < \delta$ implies $d(T^p x, T^p y) < \varepsilon$ for all $x, y \in X$.

Let $\langle C_i : i \in \mathbb{N} \rangle$ be a sequence of 2-colorings of $\mathbb{N}$. We write the color sets of $C_i$
as $C_{i,0} \cup C_{i,1} = \mathbb{N}$. Let $Z = 2^{\mathbb{N} \times \mathbb{N}}$ with each element $z \in Z$ regarded as a doubly
indexed sequence $\langle z_i(j) \in \{0,1\} : i, j \in \mathbb{N} \rangle$. Give $Z$ the product topology, so $Z$ is a
compact metric space.

Let $T$ be the function on $Z$ such that $(Tz)_i(n) = z_i(n + 1)$ for all $i$ and $n$. This is
a kind of modified left shift map. $\mathsf{ACA}_0$ proves that $T$ is continuous. Form a point
$x \in Z$ by letting $x_i(n)$ be, for each $i$ and $n$, the unique color assigned to $n$ by $C_i$.

Apply the Auslander–Ellis Theorem to obtain an almost periodic point $y \in Z$
such that $x$ and $y$ are proximal. Let $Y$ be the closure of $\{T^i y : i \in \mathbb{N}\}$ in $Z$. By
proximality, $x \in Y$. $\mathsf{ACA}_0$ proves that $Y$ is a compact metric space and is able to form
a code for the restriction of $T$ to $Y$.

The remainder of the proof is parallel to the proof of [120, Proposition 8.10]
(see also [120, Remark, p. 163]). We will construct a strictly increasing sequence
$\langle p_i : i \in \mathbb{N} \rangle$ such that for each $j \in \mathbb{N}$ the set $\{p_i : i > j\}$ satisfies Hindman’s
Theorem for $C_j$. We require the following fact.


Claim. For any open ball $U = B_t(y)$ around $y$ in $Y$, there is a $p \in \mathbb{N}$ such that
$T^p x \in U$ and $T^p y \in U$. Moreover, $p$ may be obtained uniformly arithmetically
from $t$.


To prove the claim, we first define a value $r = t/2$. Because $y$ is almost periodic, we
can choose $N_r$ such that for every $m$ there is some $k < N_r$ with $T^{m+k} y \in B_r(y)$.
Moreover, $N_r$ is arithmetically definable relative to $r$ and parameters. Next, using
a modulus of uniform continuity, choose $s \in \mathbb{Q}^+$ such that $d(z, w) < s$ implies
$d(T^i z, T^i w) < t/2$ for all $i < N_r$.

Because $x$ and $y$ are proximal, we may choose $m$ such that $d(T^m x, T^m y) < s$,
and by construction we may choose $k < N_r$ such that $d(T^{m+k} y, y) < t/2$. Then
$d(T^{m+k} y, y) < t$ and $d(T^{m+k} x, y) < t$, so the claim holds with $p = m + k$. The
construction of $p$ can be carried out uniformly arithmetically. This completes the
proof of the claim.
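The final estimate in the proof of the claim is the triangle inequality combined with the choice of $s$. Spelled out in the notation of the proof:

```latex
\begin{align*}
d(T^{m+k} x, y)
  &\leq d\bigl(T^{k}(T^{m} x),\, T^{k}(T^{m} y)\bigr) + d(T^{m+k} y, y) \\
  &< t/2 + t/2 = t .
\end{align*}
```

The first summand is below $t/2$ because $d(T^m x, T^m y) < s$ and $k < N_r$, and the second is below $t/2$ by the choice of $k$.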

We now return to the proof of the theorem. Fix a sequence of sets $\langle V_i : i \in \mathbb{N} \rangle$ of
open balls such that $y \in V_i$ for all $i$ and, moreover, $w_i(0) = z_i(0)$ for all $w, z \in V_i$.
We construct a sequence $U_1 \supseteq U_2 \supseteq \cdots$ of open balls around $y$, and simultaneously
build an increasing sequence $\langle p_i \in \mathbb{N} : i \in \mathbb{N} \rangle$, by induction. Let $U_1 = V_1$. By
induction, assume $U_i$ has been constructed such that $y \in U_i$. Use the claim to choose
$p_i \in \mathbb{N}$ such that $T^{p_i} x \in U_i$ and $T^{p_i} y \in U_i$. Let $U_{i+1}$ be an open ball around $y$ that is
contained in $U_i \cap T^{-p_i} U_i \cap V_{i+1}$. This completes the construction of $\langle U_i \rangle$ and $\langle p_i \rangle$.
Note that $U_j \subseteq U_i$ whenever $i < j$, by induction on $j$.
Now fix $k \in \mathbb{N}$ and let $k < i(0) < i(1) < \cdots < i(n)$. We want to show that


\[
T^{p_{i(0)} + p_{i(1)} + \cdots + p_{i(n)}} x \in V_k .
\]


The proof is by induction on $n$. By construction,
\begin{align*}
T^{p_{i(0)} + p_{i(1)} + \cdots + p_{i(n)}} x
  &\in T^{p_{i(0)} + p_{i(1)} + \cdots + p_{i(n-1)}} U_{i(n)} \\
  &\subseteq T^{p_{i(0)} + p_{i(1)} + \cdots + p_{i(n-2)}} U_{i(n-1)} \\
  &\subseteq \cdots \subseteq U_{i(0)} \subseteq V_k .
\end{align*}


Because $V_k$ was chosen to fix $z_k(0)$ for all $z \in V_k$, we see that $\{p_i : i > k\}$ satisfies
Hindman’s Theorem for the coloring $C_k$. □
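The last step can be unpacked using only the definition of the shift $T$ and the choice of $V_k$. Since the point $x$ codes the colorings, the relevant fact is that $T^p x \in V_k$ for every finite sum $p$ of elements of $\{p_i : i > k\}$; the following display is our gloss on why this gives homogeneity.

```latex
\[
C_k(p) \;=\; x_k(p) \;=\; (T^{p} x)_k(0) \;=\; y_k(0)
\qquad \text{whenever } T^{p} x \in V_k .
\]
```

The second equality is the definition of $T$ applied $p$ times, and the third holds because $T^p x$ and $y$ both lie in $V_k$, where the value at coordinate $(k, 0)$ is constant. Hence every such finite sum $p$ receives the single color $y_k(0)$ under $C_k$.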


Corollary 10.6.16. The following are equivalent over $\mathsf{ACA}_0$.


1. IHT.

2. AUF.

3. $\mathsf{MT}_n$ for each $n \geq 3$.

4. The Auslander–Ellis Theorem.


The precise strength of the principles from this corollary is unknown, as is their
precise relationship to Hindman’s theorem.


Question 10.6.17. The principles in Corollary 10.6.16 are known to imply $\mathsf{ACA}_0$
and are provable in $\mathsf{ACA}_0^+$. Determine their precise strength, or show the strength is
strictly between those two systems.


Question 10.6.18. Does HT imply IHT over $\mathsf{RCA}_0$, or even over $\mathsf{ACA}_0$?


Kreuzer obtained additional characterizations of the Auslander–Ellis theorem in
terms of the existence of certain ultrafilters. This adds further principles to the
cluster with IHT.


Theorem 10.6.19 (Kreuzer [188]). The following are equivalent over $\mathsf{ACA}_0$.


1. The Auslander–Ellis Theorem (and hence all the additional equivalent results
from Corollary 10.6.16).

2. Every countable downward translation algebra has a partial minimal idempotent
ultrafilter.

3. Every countable downward translation algebra has a partial idempotent ultrafilter.


10.6.3 Measure theory and the mean ergodic theorem


Ergodic theory is another key area of dynamical systems. It relies on measure theory 
and measure preserving transformations. A significant amount of work has been 
done on effective measure theory, especially as it relates to algorithmic randomness. 

We saw in Corollary 9.11.3 that $\mathsf{WWKL}_0$ is equivalent to the principle that, for
each $X$, there is a $Y$ that is 1-random relative to $X$, which can be restated informally
as the principle that the complement of a measure zero set is nonempty. $\mathsf{WWKL}_0$ is
equivalent to many other statements from measure theory. Avigad and Simic [11]
provide a thorough summary.


Theorem 10.6.20 (Yu and Simpson [331]; see Brown, Giusto, and Simpson [23]).
The following are equivalent over $\mathsf{RCA}_0$.


1. $\mathsf{WWKL}_0$.
2. For any covering of the closed unit interval with a sequence of rational intervals
$\langle (a_i, b_i) \rangle$, we have $\sum_{i=0}^{\infty} |b_i - a_i| \geq 1$.
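For a finite family of rational intervals, the covering statement can be checked directly. The sketch below is only the finite case, which is classically trivial and carries none of the reverse-mathematical content of the theorem; the function names are hypothetical.

```python
from fractions import Fraction as Fr

def covers_unit_interval(intervals):
    """Decide whether finitely many open intervals (a, b) cover [0, 1]."""
    p = Fr(0)  # leftmost point of [0, 1] not yet known to be covered
    while True:
        reach = [b for a, b in intervals if a < p < b]
        if not reach:
            return False      # p itself is uncovered
        p = max(reach)        # everything strictly below p is now covered
        if p > 1:
            return True

def total_length(intervals):
    """Sum of the lengths |b - a| of the intervals."""
    return sum(abs(b - a) for a, b in intervals)

cover = [(Fr(-1, 10), Fr(6, 10)), (Fr(1, 2), Fr(11, 10))]
gap = [(Fr(-1, 10), Fr(1, 2)), (Fr(1, 2), Fr(11, 10))]
```

Here `cover` covers $[0,1]$ and has total length $13/10$, while `gap` misses the point $1/2$ because both of its intervals are open there. The loop terminates since `p` strictly increases through the finitely many right endpoints.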


Compared to compact dynamical systems, less work has been done on the reverse 
mathematics of ergodic theory. Avigad and Simic [11] studied the strength of a 
version of the mean ergodic theorem, and the existence of certain projections. 


Definition 10.6.21 (Partial averages and fixed points). If $T$ is an isometry of a
Hilbert space $H$, and $x \in H$, the sequence of partial averages is the sequence


\[
S_n(x) = \frac{1}{n}\,\bigl(x + Tx + T^2 x + \cdots + T^{n-1} x\bigr).
\]


The set of fixed points of $T$ is $\mathrm{Fix}(T) = \{x \in H : Tx = x\}$.
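A small finite-dimensional example shows how the partial averages approach the projection onto $\mathrm{Fix}(T)$. This is an illustrative sketch, assuming the Hilbert space $\mathbb{R}^2$ with exact rational arithmetic and the coordinate swap as the isometry; `partial_average` and `swap` are names chosen here.

```python
from fractions import Fraction as Fr

def partial_average(T, x, n):
    """S_n(x) = (1/n)(x + Tx + ... + T^{n-1}x) for vectors given as tuples."""
    total = tuple(Fr(0) for _ in x)
    y = tuple(Fr(c) for c in x)
    for _ in range(n):
        total = tuple(a + b for a, b in zip(total, y))
        y = T(y)          # advance to the next iterate of x under T
    return tuple(c / n for c in total)

def swap(v):
    """An isometry of R^2 whose fixed points are the vectors (c, c)."""
    return (v[1], v[0])
```

For `x = (1, 0)` the averages are $(2/3, 1/3)$ at $n = 3$ and $(1/2, 1/2)$ at $n = 4$, and they converge to $(1/2, 1/2)$, the orthogonal projection of $x$ onto the diagonal.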


Theorem 10.6.22 (Avigad and Simic [11]). The following are equivalent over $\mathsf{RCA}_0$.


1. $\mathsf{ACA}_0$.

2. For every Hilbert space $H$, isometry $T$ on $H$, and point $x \in H$, the sequence of
partial averages $\langle S_n(x) \rangle$ converges.

3. For every Hilbert space H, isometry T on H, and point x € H, the projection of 
x onto Fix(T) exists. 


Moreover, the above equivalence holds if we consider nonexpansive linear operators 
in place of isometries. 


10.7 Additional results in real analysis 


In this section we survey without proof the reverse mathematics of a few theorems 
that have been of particular interest in computable analysis, including fixed point 
theorems, existence theorems for ordinary differential equations, and the
Hahn–Banach theorem.
The first result we mention shows that a form of the Stone–Weierstrass theorem
is provable in $\mathsf{RCA}_0$. For definitions, see Brown [21].


Theorem 10.7.1 (Brown [21, Theorem 3.27]). The following is provable in $\mathsf{RCA}_0$.
Let $A$ be a complete separable metric space and let $S$ be an algebra on $C(A)$ that
separates points in $A$ and contains 1. Let $f\colon A \to \mathbb{R}$ be continuous and have a
modulus of uniform continuity. Then for any $\varepsilon > 0$ there is a function $g \in S$ such
that $|f(x) - g(x)| < \varepsilon$ for all $x \in A$.


The next result shows that forms of Brouwer’s fixed point theorem and Schauder’s
fixed point theorem are equivalent to $\mathsf{WKL}_0$.


Theorem 10.7.2 (Shioji and Tanaka [278]; see Simpson [288, Theorems IV.7.7
and IV.7.9]). The following are equivalent over $\mathsf{RCA}_0$.


1. $\mathsf{WKL}_0$.

2. Every continuous function from a nonempty closed convex set in $[-1,1]^{\mathbb{N}}$ to
itself has a fixed point.

3. Let $C$ be the convex hull of a nonempty finite set of points in $\mathbb{R}^n$ for some $n \in \mathbb{N}$.
Then every continuous function $C \to C$ has a fixed point.

4. Every continuous function from the unit square to itself has a fixed point.


The formalization of Schauder’s fixed point theorem leads to the following
characterization of Peano’s existence theorem for solutions of ODEs.


Theorem 10.7.3 (Simpson [288, Theorem IV.8.2]). The following are equivalent
over $\mathsf{RCA}_0$.


1. $\mathsf{WKL}_0$.

2. Suppose $f(x, y)$ is a continuous real-valued function on the rectangle $-a \leq x \leq a$,
$-b \leq y \leq b$, where $a, b > 0$. Then the initial value problem $dy/dx = f(x, y)$,
$y(0) = 0$ has a continuously differentiable solution on the interval $-\alpha \leq x \leq \alpha$,
where $\alpha = \min(a, b/M)$ and


\[
M = \max\{|f(x, y)| : -a \leq x \leq a,\ -b \leq y \leq b\}.
\]


3. If $f(x, y)$ is continuous and has a modulus of uniform continuity in some
neighborhood of $(0, 0)$ then the initial value problem $dy/dx = f(x, y)$, $y(0) = 0$ has
a continuously differentiable solution in some interval containing $x = 0$.


We end this section with a result on a formal version of the Hahn–Banach
theorem. For definitions, see [288].


Theorem 10.7.4 (Simpson [288, Theorem IV.9.3]). The following are equivalent
over $\mathsf{RCA}_0$.


1. $\mathsf{WKL}_0$.

2. Let $A$ be a separable Banach space and let $S$ be a subspace of $A$. Let $f\colon S \to \mathbb{R}$
be a bounded linear functional with $\|f\| \leq \alpha$ for some $\alpha > 0$. Then there exists
a bounded linear functional $\tilde{f}\colon A \to \mathbb{R}$ extending $f$ with $\|\tilde{f}\| \leq \alpha$.


10.8 Topology, MF spaces, CSC spaces


While metric spaces are the heart of analysis, non-metric topological spaces also 
appear. There is a natural question of how much non-metric topology can be
formalized in second order arithmetic. Of course, we cannot expect to represent every
topological space: there are $2^{2^{\aleph_0}}$ pairwise non-homeomorphic topologies on $\omega$ [193],
too many to code each one with an element of $2^{\omega}$. Therefore, we have to find a
representation of some special class of topological spaces and then study that class. We
will survey two of these special classes: countable, second countable (CSC) spaces 
introduced by Dorais, and maximal filter (MF) spaces introduced by Mummert. 

Hunter [164], Normann and Sanders [235], and Sanders [269, 270] have also 
studied topology using higher order arithmetic, which has advantages in this setting 
compared to second order arithmetic. For example, higher types make it possible to 
treat a topology on a space directly as a collection of subsets of the space. As with 
much higher order reverse mathematics, however, the link to classical computability 
theory is more tenuous. 

The programs of locale theory and domain theory provide another approach 
towards effective topology. The usual approach in domain theory is more category 
theoretic than proof theoretic. However, Mummert and Stephan [227] show the class 
of second countable MF spaces is the same as the class of second countable domain 
representable spaces, giving a link between these programs. Sanders [268] has also 
studied the reverse mathematics of domain theory. 


10.8.1 Countable, second countable spaces 


Dorais [70, 71] initiated the study of countable, second countable spaces in reverse 
mathematics. These are spaces that have a countable set of points and also a countable 
basis. The countability of the points allows for the spaces to be formalized in second 
order arithmetic. To use systems weaker than $\mathsf{ACA}_0$, it will be convenient to use
enumerated sets rather than ordinary (decidable) sets to represent open sets of points. 


Convention 10.8.1 (Enumerated sets). For the remainder of this section, an
enumerated set $V$ is coded by a function $E_V\colon \omega \to \omega$ so that $x \in V \leftrightarrow x + 1 \in \operatorname{range}(E_V)$.
This definition allows the empty set to be enumerated, e.g. by $\lambda x.0$, and in
general the function $E_V$ need not be injective. As usual, we use ordinary notation for
set operations on enumerated sets: we write $V \cap W$, $V \cup W$, etc., to represent the
corresponding enumerated sets.
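The coding in Convention 10.8.1 is easy to model. The following Python sketch represents an enumerated set by a function `E` and shows one way (an assumption on our part, since the text does not fix a construction) to obtain codes for unions and intersections; all names are hypothetical.

```python
from math import isqrt

def members(E, stages):
    """Elements of the enumerated set coded by E that appear within `stages` steps."""
    return {E(n) - 1 for n in range(stages) if E(n) > 0}

def union(E_V, E_W):
    """Code for V union W: alternate between the two enumerations."""
    return lambda n: E_V(n // 2) if n % 2 == 0 else E_W(n // 2)

def unpair(n):
    """Inverse of the Cantor pairing function, returning a pair (a, b)."""
    w = (isqrt(8 * n + 1) - 1) // 2
    b = n - w * (w + 1) // 2
    return w - b, b

def intersection(E_V, E_W):
    """Code for V intersect W: emit x + 1 when both enumerations produce x + 1."""
    def E(n):
        a, b = unpair(n)
        return E_V(a) if E_V(a) == E_W(b) else 0
    return E

empty = lambda n: 0            # codes the empty set
evens = lambda n: 2 * n + 1    # codes {0, 2, 4, ...}
threes = lambda n: 3 * n + 1   # codes {0, 3, 6, ...}
```

Membership is only semi-decidable in general: `members` can confirm that an element has appeared by some stage, but it cannot rule out that it appears later. This is why enumerated sets are usable in systems weaker than $\mathsf{ACA}_0$.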


Definition 10.8.2 (Strong and weak bases). The following definitions are made in
$\mathsf{RCA}_0$. A strong basis for a topology on a countable set $X \subseteq \mathbb{N}$ is a sequence $U = \langle U_i \rangle$
of subsets of $X$ and a function $k\colon X \times \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ such that:


1. For every $x \in X$ there is some $i$ with $x \in U_i$.
2. For $x \in X$ and all $i, j \in \mathbb{N}$, if $x \in U_i \cap U_j$ then $x \in U_{k(x,i,j)} \subseteq U_i \cap U_j$.


A weak basis for a topology on a set $X$ is a sequence $V = \langle V_i \rangle$ of enumerated subsets
of $X$ and a function $k$ satisfying properties (1) and (2).
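Properties (1) and (2) can be verified mechanically on a finite example. The sketch below uses a finite set and a finite list of basic sets, which is a simplification of the definition (where $X$ and the sequence may be infinite); `is_strong_basis` and the sample data are hypothetical.

```python
def is_strong_basis(X, U, k):
    """Check properties (1) and (2) for a finite list U of subsets of X."""
    covered = all(any(x in Ui for Ui in U) for x in X)
    refined = all(
        x in U[k(x, i, j)] and U[k(x, i, j)] <= U[i] & U[j]
        for i in range(len(U)) for j in range(len(U))
        for x in U[i] & U[j]
    )
    return covered and refined

X = {0, 1, 2}
U = [{0, 1}, {1, 2}, {1}]

def k(x, i, j):
    """Witness function: find a basic set containing x inside U[i] & U[j]."""
    return next(m for m, Um in enumerate(U)
                if x in Um and Um <= U[i] & U[j])
```

Without the set `{1}`, the family `[{0, 1}, {1, 2}]` fails property (2) at the point 1, since no basic set containing 1 fits inside the intersection. In the infinite setting the function $k$ must be given as part of the data, because $\mathsf{RCA}_0$ cannot in general find such witnesses by search over enumerated sets.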


Definition 10.8.3 (Strong and weak CSC spaces). 


• A strong CSC space is a triple $(X, U, k)$ where $X \subseteq \mathbb{N}$ and $(U, k)$ is a strong
basis for a topology on $X$.

• A weak CSC space is a triple $(X, U, k)$ where $X \subseteq \mathbb{N}$ and $(U, k)$ is a weak basis
for a topology on $X$.


When the distinction is not important, we simply write CSC spaces. A theorem 
referring to CSC spaces is really one theorem for strong CSC spaces and another for 
weak CSC spaces. 


One motivation for considering weak CSC spaces is that $\mathsf{RCA}_0$ is able to form a
weak basis for a space given by a countable set $X$ and a metric $d\colon X \times X \to \mathbb{R}^{+}$.

A set $W \subseteq X$ is open, relative to a (weak or strong) basis $\langle U_i \rangle$, if for each $x \in W$
there is an $i$ with $x \in U_i \subseteq W$. It is natural to ask whether we can obtain a suitable
$i$ from $x$ and $U$. The following proposition characterizes when this is possible; the
proof is Exercise 10.9.21.


Proposition 10.8.4 (Dorais [71]). The following is provable in $\mathsf{RCA}_0$. Let $X$ be
a (weak or strong) CSC space with basis $\langle U_i \rangle$. If $U \subseteq X$ is a nonempty set or
enumerated set, the following are equivalent.


1. $U$ is an effective union of basic open sets: there is an enumerated set $J \subseteq \omega$ with
$U = \bigcup_{j \in J} U_j$.

2. $U$ is uniformly open: there is a partial function $n\colon X \to \mathbb{N}$, which we will call a
neighborhood function, such that if $x \in U$ then $n(x) \downarrow$ and $x \in U_{n(x)} \subseteq U$.


A set U with those equivalent properties is effectively open. We can perform many 
manipulations with effectively open sets in $\mathsf{RCA}_0$.


Proposition 10.8.5. The following are provable in $\mathsf{RCA}_0$. Let $X$ be a CSC space.


1. If $U, V \subseteq X$ are effectively open, then so are $U \cap V$ and $U \cup V$.

2. If $\langle U_i \rangle$ is a sequence of effectively open sets and $n(k, x)$ is a partial function
such that $\lambda x.n(k, x)$ is a neighborhood function for $U_k$ for each $k$, then $\bigcup_i U_i$ is
effectively open.


A key aspect of CSC spaces is that we can represent a function from a CSC 
space X to a CSC space Y directly in second order arithmetic. We can thus define 
continuous functions using the usual topological definition. The next definition and 
proposition describe an effective version of continuity, analogous to the distinction 
between open and effectively open sets. 


Definition 10.8.6 (Effectively continuous maps). Let $(X, \langle U_i \rangle, k)$ and $(Y, \langle V_i \rangle, \ell)$
be CSC spaces. A function $f\colon X \to Y$ is effectively continuous if there is a partial
function $\nu\colon X \times \mathbb{N} \to \mathbb{N}$ such that, if $f(x) \in V_j$ then $\nu(x, j) \downarrow$ and $x \in U_{\nu(x,j)} \subseteq
f^{-1}(V_j)$.


Proposition 10.8.7. The following is provable in $\mathsf{RCA}_0$. If $X$ and $Y$ are CSC spaces,
$f\colon X \to Y$ is effectively continuous, and $V$ is an effectively open subset of $Y$ then
$f^{-1}(V)$ is an effectively open subset of $X$.


The preceding definitions allow us to study many theorems of general topology, 
within $\mathsf{RCA}_0$, in the restricted setting of CSC spaces. For example, Dorais [71] has
obtained results on discrete spaces and Hausdorff spaces. We state the following 
definition and theorem which examine the compactness of CSC spaces. 


Definition 10.8.8 (Basically compact space). The following definitions are made
in $\mathsf{RCA}_0$.


• A CSC space $X$ is basically compact if, for every enumerated set $I$ with $X =
\bigcup_{i \in I} U_i$, there is a finite $F \subseteq I$ with $X = \bigcup_{i \in F} U_i$.

• A CSC space $X$ is sequentially compact if every sequence of points in $X$ has an
accumulation point (in the usual topological sense).


Theorem 10.8.9 (Dorais [71]). $\mathsf{RCA}_0$ proves that every sequentially compact strong
CSC space is basically compact. Moreover, the following are equivalent over
$\mathsf{RCA}_0 + \mathsf{B}\Sigma^0_2$.


1. $\mathsf{ACA}_0$.
2. Every sequentially compact weak CSC space is basically compact.


Many interesting questions about the reverse mathematics of CSC spaces are 
open, including questions on compactness and metrizability of these spaces. 

The collection of ordered spaces has been of particular interest in topology. It has 
also received attention in reverse mathematics. For example, given a linear order $(L, <_L)$,
$\mathsf{RCA}_0$ can form a strong CSC space for the order topology on $L$.

Shafer [277] studied aspects of compactness for ordered spaces. He uses the term 
compact with respect to honest open covers to refer to the spaces we call basically 
compact. The “honesty” is that the cover comes with an explicit enumeration of 
which basic open sets it uses. 


Theorem 10.8.10 (Shafer [277]). The following are equivalent over $\mathsf{RCA}_0$.


1. $\mathsf{WKL}_0$.
2. For every linear order $L$ on $\mathbb{N}$, if $L$ is complete then the order topology on $L$ is
basically compact as a CSC space.


A weaker kind of cover would be a sequence (U;) of open sets that covers the 
space, with no additional information on which basic open sets are contained in 
the sets $U_i$. Shafer [277] also obtains equivalences between $\mathsf{ACA}_0$ and forms of
compactness using these weaker covers.


10.8.2 MF spaces


Our second representation for non-metric topological spaces uses a variation of Stone
duality to create a topology on the set of maximal filters of an arbitrary partial order.
This class of MF spaces includes all complete separable metric spaces and additional
spaces. Several theorems about MF spaces have high reverse mathematics strength.
In particular, there is a metrization theorem for these spaces that is equivalent to
$\Pi^1_2$ comprehension over $\Pi^1_1$-$\mathsf{CA}_0$. This is one of very few theorems known to have a
strength higher than $\Pi^1_1$-$\mathsf{CA}_0$.

Recall the definition of a filter in the context of forcing (Definition 7.2.7). This is
a general definition for partial orders, which we will use here in a different way. We
recall the terminology, for convenience.


Definition 10.8.11 (Filters and maximal filters). Let $(P, \leq_P)$ be a partial ordering.


1. A set $F \subseteq P$ is a filter if it is closed upward ($q \in F$ and $q \leq_P p$ implies $p \in F$)
and consistent (if $p, q \in F$ there exists $r \in F$ with $r \leq_P p$ and $r \leq_P q$).
2. A filter is maximal if it is not a proper subset of any other filter.
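On a finite partial order, filters and maximal filters can be found by brute force. The sketch below is purely illustrative: the poset of binary strings of length at most 2, ordered so that extensions lie below their prefixes (as with forcing conditions), is an example chosen here, and the function names are hypothetical.

```python
from itertools import combinations

def is_filter(F, P, leq):
    """F is upward closed, and any two elements of F have a lower bound in F."""
    upward = all(p in F for q in F for p in P if leq(q, p))
    consistent = all(any(leq(r, p) and leq(r, q) for r in F)
                     for p in F for q in F)
    return upward and consistent

def maximal_filters(P, leq):
    """Brute-force list of the maximal filters of a finite partial order."""
    elems = sorted(P)
    filters = [set(c) for n in range(len(elems) + 1)
               for c in combinations(elems, n)
               if is_filter(set(c), P, leq)]
    return [F for F in filters if not any(F < G for G in filters)]

P = {"", "0", "1", "00", "01", "10", "11"}
leq = lambda q, p: q.startswith(p)   # q <= p iff q extends p
```

In this example the filters are exactly the prefix-closed chains, so the four maximal filters are the four branches such as `{"", "0", "00"}`. This mirrors, at a finite level, how the maximal filters on the full binary tree correspond to the points of Cantor space.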


Definition 10.8.12 (MF spaces). Let $P$ be a partial ordering. The space $\mathrm{MF}(P)$ has
as its points the set of all maximal filters on $P$. For each $q \in P$ there is a basic open
set $N_q = \{f \in \mathrm{MF}(P) : q \in f\}$, and the topology on $\mathrm{MF}(P)$ is the one generated by
this basis.

A space of the form MF(P) is an MF space. If P is countable, the space is said 
to be countably based. 


MF spaces are always $T_1$, but not always Hausdorff. Even Hausdorff MF spaces
need not be metrizable. However, as the next example shows, every complete metric
space is an MF space.


Example 10.8.13. Suppose that $A$ is a complete metric space. We can construct an
MF space homeomorphic to $A$ as follows. Let $P$ consist of all rational open balls
$(a, r)$ where $a \in A$ and $r \in \mathbb{Q}^+$. Set $(a, r) \leq_P (b, s)$ if and only if $d(a, b) + r < s$;
this relation is sometimes known as formal inclusion. Then $\mathrm{MF}(P)$ is homeomorphic
to $A$.
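Formal inclusion is a purely arithmetic test on centers and radii, which is what makes it usable as a coded partial order. The following sketch assumes, for illustration only, that centers are rational numbers with the usual metric standing in for a dense subset of $A$; the function names are hypothetical.

```python
from fractions import Fraction as Fr

def formally_included(ball1, ball2, d):
    """(a, r) is formally included in (b, s) iff d(a, b) + r < s."""
    (a, r), (b, s) = ball1, ball2
    return d(a, b) + r < s

def dist(a, b):
    """The usual metric on the rationals."""
    return abs(a - b)

small = (Fr(0), Fr(1))
medium = (Fr(1, 2), Fr(2))
large = (Fr(0), Fr(3))
```

Here `small` is formally included in `medium`, and `medium` in `large`, and the triangle inequality shows that formal inclusion of this kind always implies actual inclusion of the balls. Because the inequality is strict, a ball is not formally included in itself under this test.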


Example 10.8.14. The Gandy–Harrington topology is the topology on $\omega^{\omega}$
generated by the collection of all lightface $\Sigma^1_1$ sets. This space has been
studied for its applications to descriptive set theory (see, e.g., Kechris [177]). The
set $\omega^{\omega}$ with the Gandy–Harrington topology is homeomorphic to a countably based
MF space. This topology is not regular, however, and therefore not metrizable.


Theorem 10.8.15 (Mummert and Stephan [227]). The class of MF spaces has the 
following topological properties. 


1. An MF space $X$ is homeomorphic to a countably based MF space if and only if
$X$ is second countable.
2. The class of MF spaces is closed under arbitrary topological products.
3. The class of MF spaces is closed under taking $G_\delta$ subspaces.

4. Every MF space has the property of Baire.

5. Every countably based Hausdorff MF space has either countably many points
or contains a perfect closed set.


Mummert and Stephan obtained a characterization of the second countable MF 
spaces using a particular topological game. 


Definition 10.8.16 (Strong Choquet game, Choquet [44]; see Kechris [177]). The 
strong Choquet game is a two-player game played with a fixed topological space X. 


1. On the first move, Player I chooses an open set $U_0$ and a point $x_0 \in U_0$.

2. Player II then chooses an open set $V_0$ with $x_0 \in V_0 \subseteq U_0$.

3. On a subsequent move, say move $k + 1$, Player I chooses an open set $U_{k+1}$ and a
point $x_{k+1}$ with $x_{k+1} \in U_{k+1} \subseteq V_k$.

4. Player II then chooses an open set $V_{k+1}$ with $x_{k+1} \in V_{k+1} \subseteq U_{k+1}$.


Play continues for $\omega$ rounds. At the end, Player I wins if $\bigcap_k U_k$ is empty (this is
equivalent to $\bigcap_k V_k$ being empty). Player II wins otherwise, if $\bigcap_k U_k$ is nonempty.

A topological space has the strong Choquet property if Player II has a winning 
strategy for the game on that space. 


Strong Choquet games are a particular kind of game of perfect information content, 
discussed further in Section 12.3. Choquet introduced his game to characterize 
complete metrizability of metric spaces. 


Theorem 10.8.17 (Choquet [44]). A separable metric space has a topologically 
equivalent, complete metric if and only if the space has the strong Choquet property. 


The strong Choquet property cannot be directly formulated in second order arith- 
metic, and therefore we cannot study its reverse mathematics directly. We can use 
the property to understand the class of MF spaces, however. Every MF space has 
the strong Choquet property, and beyond a small amount of separation the strong 
Choquet property is sufficient to characterize countably based MF spaces. 


Theorem 10.8.18 (Mummert and Stephan [226]). A topological space is homeomorphic
to a countably based MF space if and only if it is second countable, $T_1$, and
has the strong Choquet property.


Mummert and Stephan [227] also showed that a second countable space is domain
representable (via a dcpo) if and only if the space is $T_1$ and has the strong Choquet
property. We omit the definitions here. Representability via a dcpo is a key topic in
domain theory, a field in topology and computer science that has been applied to 
and motivated by the semantics of programming languages. Domain theory does not 
lend itself directly to reverse mathematics analysis in second order arithmetic due 
to the complexity of the basic definitions. The tie between domain theory and MF 
spaces suggests that more analysis may be possible. 

The characterization results above make essential use of second countability. It
is known that there are Hausdorff spaces with the strong Choquet property that are not
homeomorphic to MF spaces.


10.8.3 Reverse mathematics of MF spaces


It is possible to formalize MF spaces in second order arithmetic. To do so, we need 
to represent both the spaces and the continuous functions between them. 

The representation of an MF space within $\mathsf{RCA}_0$ is immediate. We begin with
a countable partial ordering $P = (\mathbb{N}, \leq_P)$. Each point in MF(P) is represented as
a maximal filter, which is now a subset of $\mathbb{N}$, similar to the way that a point in a
complete separable metric space is coded as a subset of $\mathbb{N}$. One key difference is that
the definition of a quickly converging Cauchy sequence is $\Pi^0_1$, while the definition
of a maximal filter is $\Pi^1_1$.

The representation of continuous functions between MF spaces is inspired by a
basic property of continuous functions: if $f \colon X \to Y$ is continuous and $x \in X$, then
$f(x)$ is in an open set $V \subseteq Y$ if and only if there is an open set $U \subseteq X$ with $x \in U$
and $f(U) \subseteq V$. The following definition is made in $\mathsf{RCA}_0$.


Definition 10.8.19 (Coded continuous function). A code for a continuous function
between MF(P) and MF(Q) is a subset $F$ of $\mathbb{N} \times P \times Q$. Each code induces at least
a partial map $f$ from MF(P) to MF(Q) in which


$$f(x) = \{ q \in Q : (\exists n)(\exists p \in x)[(n, p, q) \in F] \}$$


when this set is a point in MF(Q), and $f(x)$ is undefined otherwise. A coded
continuous function is a function that has a total code.
Spaces MF(P) and MF(Q) are homeomorphic if there is a coded continuous
bijection from MF(P) to MF(Q) whose inverse is also a coded continuous function.
A space MF(P) is homeomorphic to a complete separable metric space if there
is a complete metric space $A$ such that there is a homeomorphism between MF(P)
and the MF space constructed from $A$ as in Example 10.8.13.


It can be shown in ZFC that every continuous function from a countably based
MF space to a countably based MF space has a code. The role of $\mathbb{N}$ in the definition
is to allow $\mathsf{RCA}_0$ to construct the composition of two coded continuous functions.

Mummert and Simpson [226] obtained the following characterization analo- 
gous to Urysohn’s metrization theorem. We will provide only a broad sketch of 
the argument here. See Mummert and Simpson [226] for a longer sketch, and see 
Mummert [225] for full proofs and additional results. 


Theorem 10.8.20. The following are equivalent over $\Pi^1_1$-$\mathsf{CA}_0$.


1. $\Pi^1_2$ comprehension.
2. Every regular, countably based MF space is homeomorphic to a complete
separable metric space.


Proof (sketch). The forward implication is a formalization of the proof of Urysohn’s
metrization theorem and Choquet’s characterization of complete metrizability. In
this part of the proof, given a regular, countably based space MF(P), we use $\Pi^1_2$
comprehension to form the set $S$ of pairs $(p, q) \in P \times P$ such that the closure of
$N_p$ is contained in $N_q$. For a complete separable metric space, the property that the
closure of one open ball is contained in a second open ball is $\Pi^1_1$ complete in general;
for MF spaces the analogous relation is formally $\Pi^1_2$.

With the set S in hand, it is possible to produce a metric for MF(P), using 
methods inspired by work of Schröder [273] in effective topology. This metric
may not be complete, however. A second phase of the proof interpolates ideas 
from Choquet’s characterization theorem to show that every countably based MF 
space that is metrizable is completely metrizable. This two-step proof provides one 
direction of the theorem. 

For the reversal, we work in $\Pi^1_1$-$\mathsf{CA}_0$ and begin with a $\Sigma^1_2$ formula $\varphi(n)$. We want
to form the set $S = \{ n \in \mathbb{N} : \varphi(n) \}$. The reversal constructs a particular partial
order $P$ so that MF(P) is regular, and moreover there is a closed set $C$ in MF(P)
whose points are in effective correspondence (relative to $\mathsf{ACA}_0$) to the elements of $S$.
Moreover, the construction ensures that every point of $C$ is an isolated point relative
to $C$.

Intuitively, the construction ensures that, in order to construct $S$, it is sufficient to
enumerate a dense subset of $C$. This set will be $C$ itself, and by the correspondence
between $C$ and $S$ we are able to produce $S$. The construction depends on Kondo’s
theorem ($\Pi^1_1$ uniformization), which is provable in $\Pi^1_1$-$\mathsf{CA}_0$.

Having constructed $P$ and verified it is regular, by assumption we obtain a complete
metric space $A$ and a homeomorphism between MF(P) and $A$. This allows us
to find a closed set $C' \subseteq A$ that is homeomorphic to $C$. Because we are working
in $\Pi^1_1$-$\mathsf{CA}_0$, we can apply Theorem 10.5.8 to show that $C'$ is separably closed. This
gives an enumeration of the points of $C'$, which allows us to enumerate $C$ and thus $S$,
completing the reversal. □


It is also possible to study the descriptive set theory of MF spaces, although even 
basic properties quickly lead to set theoretic results. We will sketch a few additional 
results on the reverse mathematics of descriptive set theory in the final chapter. For 
the definition of L, see Section 12.3.3. 


Theorem 10.8.21 (Mummert [225]). The proposition that every closed subset of a
countably based MF space is either countable or has a perfect subset is equivalent
over $\Pi^1_1$-$\mathsf{CA}_0$ to the principle that $\aleph_1^{L(A)}$ is countable for every $A \subseteq \mathbb{N}$. In particular,
the principle is independent of ZFC and false if $V = L$.


The principle that every countably based MF space is either countable or has a
perfect subset is provable in ZFC, but the reverse mathematics strength is not known.


10.9 Exercises


Exercise 10.9.1. We will call a function $f \colon \mathbb{N} \to \mathbb{Q}^{+}$ a modulus of convergence if
$f$ is strictly decreasing and $\lim_{n \to \infty} f(n) = 0$. If $f$ is a modulus of convergence, a
sequence $(q_n)$ of rationals is $f$-quickly converging if $|q_n - q_m| < f(m)$ when $n > m$.
Thus a quickly converging Cauchy sequence is $h$-quickly converging for $h(n) = 2^{-n}$.


1. $\mathsf{RCA}_0$ proves that if $f$ and $g$ are moduli of convergence, for every $f$-quickly
converging Cauchy sequence there is a $g$-quickly converging Cauchy sequence
with the same limit.

2. If $f$ and $g$ are moduli of convergence there is a uniform procedure, computable
relative to $f$ and $g$, that converts each $f$-quickly converging sequence to a
$g$-quickly converging sequence.


Exercise 10.9.2. As usual, a sequence $(x_n)$ in a complete separable metric space
$A$ is quickly converging if $d_A(x_n, x_m) < 2^{-m}$ when $n > m$. Prove in $\mathsf{RCA}_0$ that
every quickly converging sequence $(x_n)$ in $A$ converges. That is, there is a quickly
converging sequence $(r_n)$ of points in $A$ with the same limit as $(x_n)$.


Exercise 10.9.3. An (open) Dedekind cut is a set $X$ of rational numbers so that:


1. $X$ is nonempty and $X \neq \mathbb{Q}$.
2. If $a, b \in \mathbb{Q}$, $a \in X$, and $b < a$, then $b \in X$.
3. $X$ has no maximum element.


As usual, a Dedekind cut $X$ represents the real number $\sup(X)$, and every real is
represented by some Dedekind cut.

Show that the operation $+_D$ of addition on Dedekind cuts is not uniformly
computable. In particular, the Turing jump operation is Weihrauch reducible to the
parallelization of $+_D$. (Similar results hold if we do not require the Dedekind cuts to be
open.)


Exercise 10.9.4 (Simpson [288, II.4.2]). The following nested completeness theorem
is provable in $\mathsf{RCA}_0$. Assume that $(a_i : i \in \mathbb{N})$ and $(b_i : i \in \mathbb{N})$ are sequences
of real numbers such that, for all $n$, $a_n \leq a_{n+1} \leq b_{n+1} \leq b_n$, and $\lim |b_n - a_n| = 0$.
Then there is a real number $x$ such that $\lim a_n = x = \lim b_n$.


Exercise 10.9.5. $\mathsf{RCA}_0$ proves that $\mathbb{R}$ is uncountable: there is no sequence $(r_i)$ of
real numbers that includes a code for every real number.


Exercise 10.9.6. Let $(C_i : i \in \mathbb{N})$ be a sequence of nonempty closed sets in $2^{\mathbb{N}}$ with
$C_{i+1} \subseteq C_i$ for each $i$. Prove in $\mathsf{WKL}_0$ that $\bigcap_i C_i$ is nonempty.


Exercise 10.9.7. Prove Example 10.3.5. 
Exercise 10.9.8. Prove Example 10.3.6. 


Exercise 10.9.9. Prove Lemma 10.3.7.


Exercise 10.9.10. Prove in $\mathsf{ACA}_0$ that every bounded sequence of real numbers has
a least upper bound.


Exercise 10.9.11. Prove in $\mathsf{ACA}_0$ that, in a compact metric space, every closed set is
separably closed.


Exercise 10.9.12. Prove in $\mathsf{ACA}_0$ that, in a complete separable metric space, every
separably closed set is closed.


Exercise 10.9.13 (see Avigad and Simic [11]). A set C in a complete separable 
metric space A is located if the distance function d(x,C) = inf{d(x, y) : y € C} 
exists. Prove the following in $\mathsf{RCA}_0$.


1. If $A$ is a complete separable metric space then a set $C \subseteq A$ is closed and located
if and only if the set $\{ (a, r) \in A \times \mathbb{Q} : B_r(a) \cap C = \emptyset \}$ exists.
2. $\mathsf{ACA}_0$ is equivalent to the principle that every closed set in a compact metric
space is located (Giusto and Simpson [124]).
3. $\mathsf{ACA}_0$ is equivalent to the principle that every separably closed set in a complete
separable metric space is located (Giusto and Marcone [123, Theorem 7.3]).


Exercise 10.9.14. The following results demonstrate several aspects of the nonuni- 
formity of the principle IVT. 


1. The following principle is equivalent to $\mathsf{WKL}_0$ over $\mathsf{RCA}_0$: Given a sequence of
coded continuous functions $f_i \colon [0,1] \to \mathbb{R}$ with $f_i(0) < 0$ and $f_i(1) > 0$ for
all $i$, there is a sequence $(x_i)$ of points in $[0, 1]$ with $f_i(x_i) = 0$ for all $i$.

2. Uniformly computable solutions are admitted for the special case of IVT where
the function $f$ has only one root. (Hint: Use a trisection argument. For example,
because $f(1/3)$ and $f(2/3)$ cannot both be zero, by approximating both with
enough accuracy we will eventually determine that one or the other is nonzero.)

3. Uniformly computable solutions are admitted for the special case of IVT where
the set $\{ x : f(x) = 0 \}$ is nowhere dense.


Exercise 10.9.15. Prove in $\mathsf{RCA}_0$ that the following statement implies $\mathsf{ACA}_0$: In $\mathbb{N}^{\mathbb{N}}$,
every nonempty closed set is separably closed.


Exercise 10.9.16 (Simpson [288, IV.1.7, IV.1.8]). The following are provable in
$\mathsf{WKL}_0$. Let $X$ be a compact metric space.


1. The property that a closed set $C$ is nonempty is expressible by a $\Pi^0_1$ formula
with a parameter for the code for $C$.

2. If $(C_i : i \in \mathbb{N})$ is a sequence of nonempty closed sets in $X$, there is a sequence
$(x_i : i \in \mathbb{N})$ of points of $X$ with $x_i \in C_i$.


Exercise 10.9.17. Prove the following: 


1. Lemma 10.5.10 (Use Corollary 10.5.4).
2. Theorem 10.5.12 (Use a method similar to Theorem 10.5.11).
3. Prove that $\mathsf{WKL}_0$ is equivalent over $\mathsf{RCA}_0$ to the principle that every bounded
function $f \colon [0,1] \to \mathbb{R}^{+}$ achieves a maximum. (Combine the method of
Theorem 10.5.11 with a Specker sequence.)


Exercise 10.9.18. Prove in $\mathsf{WKL}_0$ that, given a continuous function code $F$ for a total
function from $2^{\mathbb{N}}$ to itself, there is a function $f \colon 2^{<\mathbb{N}} \to 2^{<\mathbb{N}}$ that encodes $f$ as
in Definition 10.6.2.


Exercise 10.9.19. Working in $\mathsf{WKL}_0$, prove the claim in the proof of Lemma 10.6.4.
Exercise 10.9.20. Prove Lemma 10.6.6.
Exercise 10.9.21. Prove Proposition 10.8.4.


Exercise 10.9.22. Characterize the strength of the principles “every open set in a
strong CSC space is effectively open” and “every open set in a weak CSC space is
effectively open”.


Chapter 11
Algebra


Computable algebra forms a core area of research in modern computability theory. It 
has a long history, stretching back to the first half of the 20th century, with pioneering
early papers by Fröhlich and Shepherdson [119], Rabin [251], Mal’cev [200],
Seidenberg [276], and Ershov [101]. Naturally, there is also a significant intersection
with reverse mathematics, which we explore in this chapter. An interesting caveat is
that, perhaps more so than in any other subject, many applications of computability
to algebra are not readily accessible to reverse mathematics. This is particularly the 
case for computable structure theory. Questions about degree spectra, categoricity, 
etc., do not lend themselves naturally to investigation within our framework. But, of 
course, they often find at least partial reflections in questions that do. 


11.1 Groups, rings, and other structures 


Throughout this chapter, all structures we will consider will be countable, and we 
will omit repeating this explicitly. Under this assumption, we can represent common 
algebraic structures for consideration in computability theory or reverse mathematics. 
Notably, we have the following. 


Definition 11.1.1 (Groups and rings).
1. A group is a tuple $(G, \cdot_G, e_G)$ such that:


• $G$ is a subset of $\omega$,
• $\cdot_G$ is a function $G^2 \to G$,
• $e_G \in G$,


and such that the group axioms are satisfied.


2. A ring is a tuple $(R, +_R, \cdot_R, 0_R, 1_R)$ such that:


• $R$ is a subset of $\omega$,
• $+_R, \cdot_R$ are functions $R^2 \to R$,
• $0_R, 1_R \in R$,


and such that the ring axioms are satisfied.


Fields are rings, so they do not require separate representation, though we will typically
write a field as $(K, +_K, \cdot_K, 0_K, 1_K)$ in keeping with more customary notation.
We define subgroups and ideals in the obvious way, with ideals always meaning
proper ideals (not containing the unity). In Section 11.5, we will consider a subtly
different way of defining subgroup in a specific example, but otherwise we will use
the definitions above.

We will follow all usual conventions. For example, we may refer to a group
$(G, \cdot_G, e_G)$ or ring $(R, +_R, \cdot_R, 0_R, 1_R)$ simply as $G$ or $R$, for short. If $G$ is abelian,
we will write it instead as $(G, +_G, 0_G)$, and use $-a$ for the $+_G$-inverse of $a \in G$.
Similarly in $R$, we will write $-r$ and $r^{-1}$ for the inverses of $r \in R$ under $+_R$ and $\cdot_R$,
respectively. In this case we will also write, e.g., $a -_R b$ for $a +_R (-b)$, etc.

Various other common algebraic objects and constructions can be accommodated
in computability theory and, by extension, $\mathsf{RCA}_0$. For example, if $I$ is an ideal of a
ring $R$, the quotient ring $R/I$ is defined as the set of all $r \in R$ such that there is no
$r^* < r$ (in the standard order of the natural numbers) with $r -_R r^* \in I$. Addition and
multiplication in $R/I$ can be defined in a straightforward manner so as to give $R/I$
an effective ring structure: e.g., if $r_0, r_1 \in R$ then $r_0 \cdot_{R/I} r_1$ is the least $r \in R$ such
that $r -_R (r_0 \cdot_R r_1) \in I$, etc. Note that $R/I$ exists by $\Delta^0_1$ comprehension, and $\mathsf{RCA}_0$
can prove that it is a ring. Similar constructions work for other kinds of quotient
structures.
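
As an informal illustration of the minimal-representative coding of a quotient ring, the following Python sketch takes the ring of integers, coded on the natural numbers by the enumeration 0, 1, -1, 2, -2, ..., together with the ideal of multiples of 5. The particular coding, the modulus, and all function names are assumptions made for this example only; they are not part of the formal development.

```python
# Hypothetical coding of the ring Z on the natural numbers: 0, 1, -1, 2, -2, ...
def decode(c):
    """Code -> integer."""
    return (c + 1) // 2 if c % 2 == 1 else -(c // 2)

def encode(z):
    """Integer -> code (inverse of decode)."""
    return 2 * z - 1 if z > 0 else -2 * z

MODULUS = 5  # the ideal I is taken to be 5Z (an assumption for the example)

def in_ideal(c):
    return decode(c) % MODULUS == 0

def sub(a, b):
    """Ring subtraction on codes."""
    return encode(decode(a) - decode(b))

def mul(a, b):
    """Ring multiplication on codes."""
    return encode(decode(a) * decode(b))

def in_quotient(r):
    """r is in R/I iff no smaller code s has r - s in I (a bounded search)."""
    return all(not in_ideal(sub(r, s)) for s in range(r))

def representative(c):
    """Least code r with c - r in I; the search is bounded by c itself."""
    return next(r for r in range(c + 1) if in_ideal(sub(c, r)))

def mul_quotient(r0, r1):
    """Product in R/I: the least r with r - (r0 * r1) in I."""
    return representative(mul(r0, r1))

quotient_elements = [r for r in range(20) if in_quotient(r)]
```

Every search in the sketch is bounded by the code of its input, which is the informal reason that membership in the quotient is decidable relative to the ring and the ideal.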

In $\mathsf{RCA}_0$, we can also consider finite products of algebraic structures. For example,
if $G$ is a group and $\alpha \in G^{<\mathbb{N}}$ we define $\prod_{a \in \alpha} a$ inductively on the length of $\alpha$: if
$|\alpha| = 0$ then $\prod_{a \in \alpha} a = 1_G$; given $n \in \mathbb{N}$, having defined $\prod_{a \in \alpha^*} a$ for all $\alpha^*$ of length
$n$, and given $\alpha$ of length $n + 1$, we define $\prod_{a \in \alpha} a = \prod_{a \in \alpha \restriction n} a \cdot_G \alpha(n)$. $\mathsf{RCA}_0$
suffices to prove the usual properties of this operation, such as that if $G$ is abelian and
$\alpha, \beta \in G^{<\mathbb{N}}$ have the same range then $\prod_{a \in \alpha} a = \prod_{a \in \beta} a$ (see Exercise 11.7.2). In
this case, we may unambiguously write this instead as $\sum_{a \in F} a$, for $F$ a finite subset
of $G$, as is customary.

Also important are formal sums. As we will see, polynomial rings play a big role 
in facilitating reversals in the reverse mathematics of algebra. 


Definition 11.1.2 (Polynomials). Let $K$ be a field and fix $n \in \omega$.


1. A monomial (in $n$ variables) is an element $m$ of $\omega^n$.
2. A polynomial (over $K$ in $n$ variables) is an element $p$ of $\omega^{<\omega}$ as follows:


• For all $i < |p|$, $p(i) = \langle k, m \rangle$, where $k \in K$ and $m$ is a monomial in $n$
variables.

• For all $i < j < |p|$, if $p(i) = \langle k_0, m_0 \rangle$ and $p(j) = \langle k_1, m_1 \rangle$ then $m_0$
lexicographically precedes $m_1$.


We are representing $K[x_0, \ldots, x_{n-1}]$ here. The idea is that a monomial $m$ represents
$x_0^{m(0)} \cdots x_{n-1}^{m(n-1)}$. A polynomial $p$ with $p(i) = \langle k_i, m_i \rangle$ for all $i < |p|$ represents
the formal sum $\sum_{i < |p|} k_i m_i$. We will use this more familiar notation, with suitable
accompanying terminology. For example, for each $i < |p|$ we refer to $p(i)$ as a term
of $p$. If $m$ is a monomial with $m(i) = j \in \mathbb{N}$ for some $i < |m|$ we say $x_i^j$ is a factor of
$m$. We identify each monomial $m$ as the single term polynomial $1_K m$, and for each
$k \in K$ identify the single term polynomial $k x_0^0 \cdots x_{n-1}^0$ with $k$. From here, we can
define scalar multiplication, addition, and multiplication for polynomials as usual,
and it is easy to see that these operations are computable. That is, the map taking
(codes for) polynomials $p$ and $q$ to (a code for) their product is a computable map
$\omega^2 \to \omega$, etc.
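
To make the computability of these operations concrete, here is a minimal Python sketch of polynomials over $\mathbb{Q}$ represented as lists of (coefficient, monomial) pairs sorted lexicographically by monomial, mirroring the definition above. Exponent tuples stand in for monomials and Python's `Fraction` stands in for the field; the function names are hypothetical, and the sketch works with the pairs directly rather than with their numerical codes.

```python
from collections import defaultdict
from fractions import Fraction

def normalize(terms):
    """Collect like monomials, drop zero coefficients, sort lexicographically."""
    acc = defaultdict(Fraction)  # missing monomials start at coefficient 0
    for k, m in terms:
        acc[m] += k
    return [(k, m) for m, k in sorted(acc.items()) if k != 0]

def add(p, q):
    """Sum of two polynomials given as lists of (coefficient, monomial)."""
    return normalize(p + q)

def scale(c, p):
    """Scalar multiple of a polynomial."""
    return normalize([(c * k, m) for k, m in p])

def mul(p, q):
    """Product: multiply coefficients and add exponent vectors termwise."""
    return normalize([
        (k0 * k1, tuple(a + b for a, b in zip(m0, m1)))
        for k0, m0 in p
        for k1, m1 in q
    ])

# Two variables: the monomial (1, 0) is x_0 and (0, 1) is x_1.
x0 = [(Fraction(1), (1, 0))]
x1 = [(Fraction(1), (0, 1))]
difference_of_squares = mul(add(x0, x1), add(x0, scale(Fraction(-1), x1)))
```

Here `difference_of_squares` represents $x_0^2 - x_1^2$, with the cross terms cancelled by `normalize`.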


Definition 11.1.3 (Polynomial rings). Let $K$ be a field.


1. For $n \in \omega$, $K[x_0, \ldots, x_{n-1}]$ or $K[x_i : i < n]$ is the structure with domain the
set of all (codes for) polynomials over $K$ in $n$ variables, with the usual ring
operations and identities.

2. $K[x_0, x_1, \ldots]$ or $K[x_i : i \in \omega]$ is the structure with domain the set of all (codes
for) polynomials over $K$ in $n$ variables for some $n \geq 1$, with the usual ring
operations and identities.


These definitions directly formalize in $\mathsf{RCA}_0$, which can prove that $K[x_i : i < n]$ and
$K[x_i : i \in \mathbb{N}]$ are indeed rings.

A thorough survey of coding techniques for representing algebraic objects can be
found in the pioneering works of Rabin [251] and Fröhlich and Shepherdson [119].
The main takeaway is that, as in our earlier examples, these codings are natural 
and benign in the sense that once we have made them we can largely forget about 
them. Worth noting, too, is that many small variations on such codings are possible. 
Typically, the specific choice of such variation makes no difference. 


11.2 Vector spaces and bases 


Before venturing off to abstract algebra for the remainder of this chapter, we take a 
brief excursion to linear algebra. Our aim is to prove the following result, in large 
part to give a first impression of some of the coding techniques commonly employed 
in the reverse mathematics investigation of algebraic theorems. 


Theorem 11.2.1 (Friedman, Simpson, and Smith [117], after Dekker (unpublished)
and Metakides and Nerode [209]). The following are equivalent under $\leq_{\mathrm{c}}$
and over $\mathsf{RCA}_0$.


1. TJ.
2. Every vector space over a field $K$ has a basis.
3. Every vector space over $\mathbb{Q}$ has a basis.


Hence, (2) and (3) are equivalent over $\mathsf{RCA}_0$ to $\mathsf{ACA}_0$.


Here, we use $\mathbb{Q}$ to refer to the field $(\mathbb{Q}, +, \cdot, 0, 1)$, as usual. We begin with an
elaboration on Definition 11.1.1.


Definition 11.2.2 (Vector spaces). Let $K$ be a field. A vector space over $K$ is an
abelian group $(V, +_V, 0_V)$ together with a function $\cdot_V \colon K \times V \to V$ satisfying the
axioms of scalar multiplication.


We can add some standard terminology. Given a finite set $F \subseteq V$ and $v \in V$, we say
$v$ is a linear combination of (the vectors in) $F$ if for each $b \in F$ there exists $k_b \in K$
with $v = \sum_{b \in F} k_b b$; the linear combination is nontrivial if $k_b \neq 0_K$ for some $b \in F$. A
finite set $F \subseteq V$ is linearly dependent if $0_V$ is a nontrivial linear combination of $F$;
otherwise, $F$ is linearly independent. Finally, a basis for $V$ is a set $B \subseteq V$ satisfying
the following usual properties.


• Every finite subset of $B$ is linearly independent.
• Every $v \in V$ is a linear combination of some finite subset of $B$.


$\mathsf{RCA}_0$ can verify the standard fact that for every nonzero $v \in V$ there is a unique
finite $F \subseteq B$ and coefficients $k_b$ such that $v = \sum_{b \in F} k_b b$ and $k_b \neq 0_K$ for all
$b \in F$. We will call $F$ the canonical representation of $v$ in terms of the basis $B$. (See
Exercise 11.7.1.)


Proof (of Theorem 11.2.1). We prove the equivalence over $\mathsf{RCA}_0$.


(1) → (2): We argue in $\mathsf{ACA}_0$. Let $V$ be given. If there is a finite set $B \subseteq V$ that forms
a basis for $V$, we are done. So assume not. Using arithmetical comprehension we
inductively define a sequence $(b_i : i \in \mathbb{N})$, as follows. Let $b_0 \in V$ be arbitrary. Then
fix $n > 0$, and assume that we have defined $b_i$ for all $i < n$ and that $F = (b_i : i < n)$ is
linearly independent. By assumption, there is a $v \in V$ that is not a linear combination
of $F$. Choose the least such $v$ and let $b_n = v$. Now $F \cup \{b_n\}$ is easily seen to be linearly
independent, so the inductive assumption is maintained, and $B = \{ b_i : i \in \mathbb{N} \}$ is a
basis for $V$.
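
The greedy construction in this argument can be sketched in Python in a setting where linear dependence is decidable, namely a finite list of vectors in $\mathbb{Q}^d$. This is an illustration under simplifying assumptions, not a formalization: in a general coded vector space, deciding whether $v$ is a linear combination of $F$ is exactly where arithmetical comprehension is used. The helper names are hypothetical, and zero vectors are skipped.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of rational vectors, by Gauss-Jordan elimination."""
    rows = [[Fraction(a) for a in r] for r in rows]
    ncols = len(rows[0]) if rows else 0
    rk = 0
    for col in range(ncols):
        # Find a pivot in this column among the rows not yet used.
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        # Clear the column in every other row.
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def greedy_basis(enumeration):
    """Scan the vectors in order, keeping each one outside the current span."""
    basis = []
    for v in enumeration:
        if rank(basis + [v]) > len(basis):
            basis.append(v)
    return basis

vectors = [(0, 0, 0), (1, 1, 0), (2, 2, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)]
basis = greedy_basis(vectors)
```

Scanning the enumeration once in order selects, at each stage, the least vector not in the span of those already chosen, since the span only grows.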


(2) — (3): Obvious. 


(3) → (1): We now argue in $\mathsf{RCA}_0$. Fix an injective function $f \colon \mathbb{N} \to \mathbb{N}$. Our task is
to show that range($f$) exists. We will work with the set of all formal sums $\sum_{i < n} q_i x_i$
over $\mathbb{Q}$, which naturally forms a vector space $V$ over $\mathbb{Q}$. Our coding of sequences
from Chapter 5 ensures that for each $i$, the code of the vector $x_i$ is minimal among
the codes of vectors $\sum_{j < n} q_j x_j$ with $q_i \neq 0$.

For each $j \in \mathbb{N}$, let $x'_j = x_{2f(j)} + (j + 1) x_{2f(j)+1}$. Let $U_0$ be the subspace of
$V$ generated by the linear span of $\{ x'_j : j \in \mathbb{N} \}$. Then $U_0$ exists, because a vector
$\sum_{i < n} q_i x_i$ belongs to $U_0$ if and only if the following conditions hold for all $i$:


• If $2i < n$ and $q_{2i} \neq 0$ then $2i + 1 < n$ and $q_{2i+1} = (j + 1) q_{2i}$ for some $j$ with
$f(j) = i$.
• If $2i + 1 < n$ and $q_{2i} = 0$ then $q_{2i+1} = 0$.


Furthermore, $B_0 = \{ x'_j : j \in \mathbb{N} \}$ forms a basis for $U_0$.
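
The two conditions characterizing membership in $U_0$ amount to a bounded check on the coefficients, which the following Python sketch makes explicit. A vector is given as the list of its rational coefficients $q_0, \ldots, q_{n-1}$ and $f$ as a Python function; the name `in_U0` and the sample function $f(j) = 2j + 1$ are assumptions made for illustration.

```python
from fractions import Fraction

def in_U0(q, f):
    """Decide whether sum_{i<n} q[i] x_i lies in U_0, for an injective f."""
    n = len(q)
    for i in range((n + 1) // 2):          # all i with 2i < n
        even = Fraction(q[2 * i])
        odd = Fraction(q[2 * i + 1]) if 2 * i + 1 < n else Fraction(0)
        if even == 0:
            if odd != 0:                   # second condition fails
                return False
            continue
        if 2 * i + 1 >= n:                 # first condition needs 2i + 1 < n
            return False
        ratio = odd / even                 # must equal j + 1 for some j
        if ratio.denominator != 1 or ratio < 1:
            return False
        j = ratio.numerator - 1            # the only candidate for j
        if f(j) != i:
            return False
    return True

def sample_f(j):
    """A sample injective function (hypothetical): f(j) = 2j + 1."""
    return 2 * j + 1
```

The point of the check is that the only candidate for $j$ is read off from the ratio $q_{2i+1}/q_{2i}$, so no unbounded search through the range of $f$ is needed.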

Let $V_1$ be the quotient space $V/U_0$ (using minimal representatives, as discussed in
the previous section). Applying (3), we may fix a basis $B_1$ for $V_1$. Then $B = B_0 \cup B_1$
is a basis for $V$. To see this, we verify the clauses in the definition of a basis.

• A finite set $F \subseteq B$ may be written as $F_0 \cup F_1$, where $F_0 \subseteq B_0$ and $F_1 \subseteq B_1$. Each
of $F_0$ and $F_1$ is linearly independent. Hence, if $F$ were linearly dependent, then
some nontrivial linear combination of $F_1$ would be an element of $U_0$. But this
means that, in $V_1$, this linear combination is equal to $0$, and so $F_1$ is linearly
dependent after all.

• Fix any $v \in V$. We may then fix the unique $w$ such that $w \in V_1$ and $v - w \in U_0$.
(If $v \in U_0$ then $w = 0$.) Fix $F_0 \subseteq B_0$ and $F_1 \subseteq B_1$ such that $v - w$ is a linear
combination of $F_0$ and $w$ is a linear combination of $F_1$. Then $v$ is a linear
combination of $F_0 \cup F_1$.


To complete the proof, let $R$ be the set of all $i \in \mathbb{N}$ such that either the canonical
representation of $x_{2i}$ in terms of $B$ or the canonical representation of $x_{2i+1}$ in terms of
$B$ contains $x'_j \in B_0$ for some $j$ with $f(j) = i$. Clearly, $R$ exists by $\Delta^0_1$ comprehension,
and we claim that $R$ = range($f$). The containment $R \subseteq$ range($f$) is obvious.


Conversely, suppose $i \in$ range($f$), say with $f(j) = i$. Thus,
$x'_j = x_{2f(j)} + (j + 1) x_{2f(j)+1} = x_{2i} + (j + 1) x_{2i+1}$ belongs to $U_0$, and in fact to $B_0$. In particular, the
canonical representation of $x'_j$ in terms of $B$ is $\{x'_j\}$. We now consider two cases.


Case 1: $x_{2i} \in V_1$. Let $F$ be the canonical representation of $x_{2i}$ in terms of $B$. By
assumption, $F \subseteq B_1$. Since $x'_j \notin B_1$ and canonical representations are unique, it
follows that $F \cup \{x'_j\}$ is the canonical representation in terms of $B$ of $(j + 1) x_{2i+1} =
x'_j - x_{2i}$. But then $F \cup \{x'_j\}$ is also the canonical representation of $x_{2i+1}$, since this is
just a nonzero scalar multiple of $(j + 1) x_{2i+1}$.


Case 2: $x_{2i} \notin V_1$. Since $V_1$ is a quotient space, there exist vectors $v \in V_1$ and $u \in U_0$
such that $v < x_{2i}$ and $x_{2i} - v = u$. By our assumption on the order of codings, $v$
cannot contain $x_{2i}$. By definition of $U_0$, this means that $u = w + x'_j$ for some $w$ that
does not contain $x_{2i}$ or $x_{2i+1}$. Let $F_v$ be the canonical representation of $v$ in terms
of $B$, and $F_w$ the canonical representation of $w$ in terms of $B$. Thus, $F_v \subseteq B_1$ and
$F_w \subseteq B_0 \setminus \{x'_j\}$. Since $B_0$ and $B_1$ are disjoint, it follows that $F_v \cup F_w \cup \{x'_j\}$ is the
canonical representation of $x_{2i} = v + w + x'_j$.


In either case, we see that $i \in R$, as was to be shown. □


The implication from (1) to (2) featured the initial case analysis about whether or
not $V$ is “finite dimensional”, meaning “has a finite basis”. If so, then of course the
theorem is provable in $\mathsf{RCA}_0$ (trivially). But we could also define “finite dimensional”
to mean “there exists an $n$ such that every finite set $F \subseteq V$ of size $n$ is linearly
dependent”. The assertion that every such space has a basis is computably true
(since every solution is, in particular, a finite set). However, this is no longer as
straightforward to prove.


Theorem 11.2.3 (Hirst and Mummert [160]). The following are equivalent over
$\mathsf{RCA}_0$.


1. $\mathsf{I}\Sigma^0_2$.
2. Let $V$ be a vector space such that, for some $n \in \mathbb{N}$, every finite set $F \subseteq V$ of size
$n$ is linearly dependent. Then $V$ has a basis.


Finite dimensional vector spaces also turn out to be interesting when viewed under
some of our stronger reducibilities. Consider the following family of problems.


Definition 11.2.4 (Hirst and Mummert [160]). Fix $n \geq 2$. $\mathsf{VSB}_n$ is the problem
whose instances are all $n$-dimensional vector spaces over $\mathbb{Q}$, with the solutions to
any such vector space being all its bases.


As a $\forall\exists$ theorem, $\mathsf{VSB}_n$ is vacuous. But not so as a problem under $\leq_{\mathrm{W}}$. Recall the
choice problem $\mathsf{C}_{\mathbb{N}}$ from Exercise 4.8.10.


Theorem 11.2.5 (Hirst and Mummert [160]). For all $n \geq 2$, $\mathsf{VSB}_n \equiv_{\mathrm{W}} \mathsf{C}_{\mathbb{N}}$.


This is a somewhat surprising result, perhaps more so by virtue of the fact that it holds
for all $n \geq 2$. That is, the specific dimension of a finite dimensional vector space
does not affect the uniform computational complexity of finding a basis. This stands
in contrast to some results we have seen earlier involving parameterized principles,
where the specific value of the parameter was more significant. For example, recall
Proposition 4.3.7, stating that $\mathsf{RT}^1_k \nleq_{\mathrm{W}} \mathsf{RT}^1_j$ whenever $k > j$.

The proof of Theorem 11.2.5 is an elaboration on that of Theorem 11.2.1. By
taking a suitable quotient of the vector space $V_1$ constructed above, one obtains a
2-dimensional vector space every basis of which can be used to select an element
not enumerated by a given instance of $\mathsf{C}_{\mathbb{N}}$.


11.3 The complexity of ideals 


We begin by considering two classical theorems of ring theory whose analysis 
in reverse mathematics is by now classical in its own right, and which is often 
extolled as a hallmark of the subject’s capacity to reveal hidden differences between 
mathematical results. These theorems concern the existence of prime and maximal 
ideals in commutative rings. 

Recall that an ideal $I$ of a ring $R$ is


• prime if for all $g, h \in R$, if $g \cdot_R h \in I$ then $g \in I$ or $h \in I$,
• maximal if there is no ideal $I^*$ of $R$ such that $I \subseteq I^*$ and $I \neq I^*$.


Every commutative ring has both a prime and maximal ideal. The typical way to 
prove this is as follows. First, observe that every maximal ideal is prime, so it suffices 
to show the existence of the former. To this end, note that any union of ideals is still 
an ideal, and so a maximal ideal exists by an application of Zorn’s lemma. 

The above is a completely nonconstructive proof, and of course it completely
obscures any potential distinction between building a maximal ideal and building a
prime ideal. As it turns out, a distinction does exist, and it can be measured exactly
using subsystems of $\mathsf{Z}_2$.

We begin by calibrating the strength of the principle asserting the existence of
prime ideals.


Theorem 11.3.1 (Friedman, Simpson, and Smith [117, 118]). The following are
equivalent under $\leq_{\mathrm{c}}$ and over $\mathsf{RCA}_0$.


1. WKL.
2. Every commutative ring has a prime ideal. 


Proof. We focus on the equivalence over $\mathsf{RCA}_0$. The equivalence under computable
reducibility is similar, but actually quite a bit easier, as we comment on below.


(1) → (2): We argue in $\mathsf{WKL}_0$. Let $R$ be an infinite commutative ring, say with
domain $\{r_0, r_1, \ldots\}$, where $r_0 = 0_R$ and $r_1 = 1_R$. Define a tree $T \subseteq 2^{<\mathbb{N}}$ by
letting $\sigma \in T$ if and only if


• $\sigma(0) = 1$,

• $\sigma(1) = 0$,

• for all $x, y, z < |\sigma|$ with $r_z = r_x +_R r_y$, if $\sigma(x) = \sigma(y) = 1$ then $\sigma(z) = 1$,
• for all $x, y, z < |\sigma|$ with $r_z = r_x \cdot_R r_y$, if $\sigma(x) = 1$ then $\sigma(z) = 1$,

• for all $x, y, z < |\sigma|$ with $r_z = r_x \cdot_R r_y$, if $\sigma(x) = \sigma(y) = 0$ then $\sigma(z) = 0$.


We will show that $T$ is infinite. Once we do this, we can apply WKL to fix a path $f$
through $T$. Then $I = \{ r_x : f(x) = 1 \}$ is $\Delta^0_1$ definable from $f$, and it is easy to verify,
using the properties above, that it is a prime ideal. For instance, if $r_z = r_x \cdot_R r_y \in I$
then $f(z) = 1$, so by definition either $f(x) = 1$ or $f(y) = 1$, meaning $r_x \in I$
or $r_y \in I$.
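
The defining conditions of $T$ are a finite check on each string, which we can illustrate with a short Python sketch. For concreteness the sketch uses the finite ring $\mathbb{Z}/6\mathbb{Z}$ with $r_x = x$ (the proof itself concerns an infinite ring); the function names and this choice of ring are assumptions for the example only.

```python
def in_tree(sigma, add, mul):
    """Check the five defining conditions of T for a finite 0/1 string sigma.

    add(x, y) and mul(x, y) return the index z with r_z = r_x + r_y and
    r_z = r_x * r_y respectively; indices 0 and 1 name the zero and the unity.
    """
    n = len(sigma)
    if n > 0 and sigma[0] != 1:
        return False
    if n > 1 and sigma[1] != 0:
        return False
    for x in range(n):
        for y in range(n):
            s, p = add(x, y), mul(x, y)
            # Closure under addition.
            if s < n and sigma[x] == 1 and sigma[y] == 1 and sigma[s] != 1:
                return False
            if p < n:
                # Absorption under multiplication.
                if sigma[x] == 1 and sigma[p] != 1:
                    return False
                # Primeness: a product of two non-members is a non-member.
                if sigma[x] == 0 and sigma[y] == 0 and sigma[p] != 0:
                    return False
    return True

def add6(x, y):
    return (x + y) % 6

def mul6(x, y):
    return (x * y) % 6

evens = [1, 0, 1, 0, 1, 0]        # characteristic string of {0, 2, 4}
threes = [1, 0, 0, 1, 0, 0]       # characteristic string of {0, 3}
zero_only = [1, 0, 0, 0, 0, 0]    # characteristic string of {0}
```

The strings `evens` and `threes` describe the two prime ideals of $\mathbb{Z}/6\mathbb{Z}$ and pass the check, while `zero_only` fails the last condition because $2 \cdot 3 = 0$.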

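To show that $T$ is infinite, we begin by defining a function $p$ from $2^{<\mathbb{N}}$ to (codes
for) finite subsets of $\mathbb{N}$ by induction on $\sigma \in 2^{<\mathbb{N}}$. Let $p(\langle\rangle)$ be (a code for) the finite
set $\{0_R\}$, and assume inductively that $p(\sigma)$ is defined and equal to (a code for) a
finite set for some $\sigma \in 2^{<\mathbb{N}}$. Now fix $x, y, z, k \in \mathbb{N}$.


• If $|\sigma| = 4\langle x, y, z, k \rangle$ with $r_z = r_x +_R r_y$ and $r_x, r_y \in p(\sigma)$, then let $p(\sigma 0) =
p(\sigma) \cup \{r_z\}$ and $p(\sigma 1) = \emptyset$.

• If $|\sigma| = 4\langle x, y, z, k \rangle + 1$ with $r_z = r_x \cdot_R r_y$ and $r_x \in p(\sigma)$, then let $p(\sigma 0) =
p(\sigma) \cup \{r_z\}$ and $p(\sigma 1) = \emptyset$.

• If $|\sigma| = 4\langle x, y, z, k \rangle + 2$ with $r_z = r_x \cdot_R r_y$ and $r_z \in p(\sigma)$, then let $p(\sigma 0) =
p(\sigma) \cup \{r_x\}$ and $p(\sigma 1) = p(\sigma) \cup \{r_y\}$.

• If $|\sigma| = 4\langle x, y, z, k \rangle + 3$ with $1_R \in p(\sigma)$ then let $p(\sigma 0) = p(\sigma 1) = \emptyset$.


Otherwise, let $p(\sigma 0) = p(\sigma 1) = p(\sigma)$. This completes the definition of $p$, which
exists by $\Delta^0_1$ comprehension.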

Let $U$ be the set of all $\sigma \in 2^{<\mathbb{N}}$ such that $p(\sigma) \neq \emptyset$. By $\Pi^0_1$ induction on $n$, if
$\sigma \in 2^{<\mathbb{N}}$ has length $n$ then $x < n$ for all $x \in p(\sigma)$. So $U$ exists, and it is readily seen
that $U$ is a tree. We claim that it is infinite. To this end, we prove the stronger fact
that for every $n$, there is a $\sigma \in 2^{<\mathbb{N}}$ of length $n$ such that $\emptyset \neq p(\sigma)$ and $\langle p(\sigma) \rangle$,
the subring of $R$ generated by the elements of $p(\sigma)$, does not contain $1_R$. Note that,
as there are only finitely many strings of length $n$, the condition to verify is actually
$\Pi^0_1$ in $n$. Hence, we can again proceed using $\Pi^0_1$ induction. The result is clear for
$n = 0$, so fix $n$ and assume the result is true for this $n$. Fix a witnessing string $\sigma$.
If $|\sigma| = 4\langle x, y, z, k \rangle$ or $|\sigma| = 4\langle x, y, z, k \rangle + 1$, then $p(\sigma 0)$ adds at most the sum or
product of two elements already in $p(\sigma)$, and hence $\langle p(\sigma 0) \rangle = \langle p(\sigma) \rangle$. In this case,
$\sigma 0$ witnesses the result for $n + 1$. If $|\sigma| = 4\langle x, y, z, k \rangle + 2$ and $r_z = r_x \cdot_R r_y \in p(\sigma)$, then
either $\langle p(\sigma) \cup \{r_x\} \rangle = \langle p(\sigma 0) \rangle$ or $\langle p(\sigma) \cup \{r_y\} \rangle = \langle p(\sigma 1) \rangle$ does not contain $1_R$.
So, either $\sigma 0$ or $\sigma 1$ witnesses the result for $n + 1$. And if $|\sigma| = 4\langle x, y, z, k \rangle + 3$ then
by definition, both $\sigma 0$ and $\sigma 1$ witness the result for $n + 1$. This proves the claim.

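Apply WKL to obtain a path $h$ through $U$. By construction, $\emptyset \neq p(h \upharpoonright n)$ for all
$n$. Moreover, we must have $1_R \notin \langle p(h \upharpoonright n) \rangle$ for all $n$. If not, fix the largest $n$ such
that $1_R \notin \langle p(h \upharpoonright n) \rangle$. Then necessarily $n = 4\langle x, y, z, k \rangle + 2$ for some $x$, $y$, $z$, and $k$,
and either $h(n) = 0$ and $r_x^{-1} \in \langle p(h \upharpoonright n) \rangle$, or $h(n) = 1$ and $r_y^{-1} \in \langle p(h \upharpoonright n) \rangle$. Say it
is the former; the latter case is symmetric. Fix $w$ such that $r_x^{-1} = r_w$, and choose the
least $k$ so that $m = 4\langle x, w, 1, k \rangle + 2 > n$. Then $p(h \upharpoonright m)$ contains both $r_x$ and $r_w$,
and since $r_1 = 1_R = r_x \cdot_R r_w$ by assumption, it follows that $p(h \upharpoonright (m + 1))$, being
nonempty, equals $p(h \upharpoonright m) \cup \{1_R\}$. But then by construction, $p(h \upharpoonright (m + 2)) = \emptyset$, a
contradiction.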

Now fix any $\ell$; we exhibit an element of $T$ of length $\ell$. Let $b = \max\{r_x : x < \ell\}$.
Form the set $F = \{r_x \leq b : x < \ell \wedge (\exists n)[r_x \in p(h \upharpoonright n)]\}$, which exists by bounded
$\Sigma^0_1$ comprehension. Define $\sigma \in 2^{<\mathbb{N}}$ of length $\ell$ by $\sigma(x) = 1$ if and only if $r_x \in F$,
for all $x < \ell$. Then it is not difficult to check that $\sigma$ satisfies each of the defining
conditions in the definition of $T$, hence $\sigma \in T$.


(2) → (1): We argue in $\mathsf{RCA}_0$. Fix functions $f \colon \mathbb{N} \to \mathbb{N}$ and $g \colon \mathbb{N} \to \mathbb{N}$ with disjoint
ranges. We use (2) to obtain a set $Z$ such that for all $x$, if $x$ is in the range of $f$ then
$x \in Z$ and if $x$ is in the range of $g$ then $x \notin Z$. By Exercise 5.13.18, this establishes
WKL.

Consider the polynomial ring $\mathbb{Q}[x_i : i \in \mathbb{N}]$, and consider the ideal


\[ J = \langle \{x_i^{j+1} : f(j) = i\} \cup \{x_i^{j+1} - 1 : g(j) = i\} \rangle. \]


A polynomial $p \in \mathbb{Q}[x_i : i \in \mathbb{N}]$ belongs to $J$ if and only if it contains a monomial with
a factor of the form $x_i^s$, where $f(j) = i$ or $g(j) = i$ for some $j < s$. Hence, $J$ exists
by $\Delta^0_1$ comprehension. Consequently, so does the quotient ring $R = \mathbb{Q}[x_i : i \in \mathbb{N}]/J$.
Write $[p]$ for the representative of $p$ in $R$, and notice that the function $p \mapsto [p]$
exists.

Apply (2) to find a prime ideal $I$ of $R$. Let $Z = \{i : [x_i] \in I\}$, which exists because
the function $p \mapsto [p]$ exists. If $i \in \operatorname{range}(f)$, say with $f(j) = i$, then $x_i^{j+1} \in J$, hence
$[x_i]^{j+1} = [x_i^{j+1}] = 0_R \in I$. By primeness, it follows that $[x_i] \in I$. Thus, $i \in Z$. On
the other hand, if $i \in \operatorname{range}(g)$, say with $g(j) = i$, then $[x_i]$ cannot belong to $I$ else
so would $[x_i]^{j+1} = [x_i^{j+1}] = 1_R$. Thus, $i \notin Z$. This completes the proof. □


That (1) and (2) are equivalent under $\leq_c$ follows by an analogous argument. But
the reduction of (2) to (1) is an interesting example of how a computability theoretic
construction may be delicate to formalize in $\mathsf{RCA}_0$ for reasons other than induction.
Indeed, consider again the tree $T \subseteq 2^{<\omega}$ constructed in the proof of the (1) → (2)
implication. This tree is computable in the given ring $R$, and as noted, every path
through $T$ defines a prime ideal of the ring $R$. But it is also easy to see that if $I$ is
any prime ideal of $R$ then the characteristic function of $I$ is a path through $T$. Hence,
since we know the classical result that $R$ has a prime ideal, we immediately know
that $T$ is infinite. But the classical result is exactly what we are proving in $\mathsf{RCA}_0$,
so we cannot invoke it in the course of our proof, and that is why $T$ being infinite
requires separate justification. (We saw the opposite phenomenon in the proof of
Proposition 4.1.3.) The ability to use “external facts” like this can make working
with finer reducibilities easier than working in $\mathsf{RCA}_0$.


Theorem 11.3.2 (Friedman, Simpson, and Smith [117]). The following are equivalent
under $\leq_c$ and over $\mathsf{RCA}_0$.


1. TJ.
2. Every commutative ring has a maximal ideal.


Hence, (2) is equivalent over $\mathsf{RCA}_0$ to $\mathsf{ACA}_0$.


Proof. Here, let us prove the equivalence under computable reducibility. The argu- 
ments can be readily formalized (albeit the proof becomes longer). 


(2) $\leq_c$ (1): Let $R$ be an infinite commutative ring, say with domain $\{r_0 < r_1 < \cdots\}$
where $r_0 = 0_R$. Define a function $f$ from $\omega$ to (codes for) finite sets, as follows:
$f(0) = \{0_R\}$, and for all $x > 0$,


\[ f(x) = \begin{cases} f(x-1) \cup \{r_x\} & \text{if } 1_R \notin \langle f(x-1) \cup \{r_x\} \rangle, \\ f(x-1) & \text{otherwise.} \end{cases} \]


Clearly, $f \leq_T R'$. Let $I = \bigcup \operatorname{range}(f)$. Since $f$ is nondecreasing, we have $I \leq_T R \oplus f \leq_T
R'$. Clearly, $0_R \in I$. By induction, $1_R \notin f(x)$ for all $x$, hence $1_R \notin I$. From here
it is easily verified that $I$ is an ideal of $R$. We claim that it is maximal. To this
end, fix any $r_z \notin I$. Then in particular $r_z \notin f(z)$, so by definition it must be that
$1_R \in \langle f(z-1) \cup \{r_z\} \rangle$. Since $f(z-1) \subseteq I$, it follows that no proper ideal contains all of $I$
as well as $r_z$. This proves the claim.
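The greedy construction of $f$ can be tried out on a finite ring, where the test "does the generated ideal contain $1_R$" is decidable. The following is a minimal sketch, not part of the source proof, assuming $R = \mathbb{Z}/n\mathbb{Z}$ with elements listed as $0 < 1 < \cdots < n-1$; the helper names `contains_one` and `greedy_maximal_ideal` are hypothetical. For an infinite ring the same test is in general only computable from the jump, which is the point of the reduction.

```python
from functools import reduce
from math import gcd


def contains_one(generators, n):
    """In Z/nZ, the ideal generated by a finite set contains 1
    exactly when gcd(n, g_1, ..., g_k) == 1."""
    return reduce(gcd, generators, n) == 1


def greedy_maximal_ideal(n):
    """Scan r_0 < r_1 < ... and keep r_x unless adding it would
    put 1 into the generated ideal (the definition of f above)."""
    ideal = [0]  # f(0) = {0_R}
    for r in range(1, n):
        if not contains_one(ideal + [r], n):
            ideal.append(r)  # f(x) = f(x-1) together with r_x
    return ideal


print(greedy_maximal_ideal(12))  # [0, 2, 4, 6, 8, 10]
```

In $\mathbb{Z}/12\mathbb{Z}$ the scan returns the ideal generated by 2, which is maximal; in a field such as $\mathbb{Z}/7\mathbb{Z}$ it returns only the zero ideal.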


(1) $\leq_c$ (2): We show that (2) codes the jump. Fix $A \in 2^\omega$. Let $K$ be the field of
fractions of the polynomial ring $\mathbb{Q}[x_i : i \in \omega]$. Formally, for all $p, q \in \mathbb{Q}[x_i : i \in \omega]$
with $q \neq 0$, let $p/q$ denote the least pair $\langle r, s \rangle$ such that $r, s \in \mathbb{Q}[x_i : i \in \omega]$, $s \neq 0$,
and $rq = ps$. Then $K$ is the set of all such $p/q$, with the obvious operations. Clearly,
$K$ is computable.

Let $S$ be the set of all polynomials $q \in \mathbb{Q}[x_i : i \in \omega]$ that contain at least one
nontrivial monomial in which only $x_i$ with $i \in A'$ appear. It is easy to see that $S$ is a
multiplicative set, and so


\[ R = \left\{ \frac{p}{q} \in K : q \in S \right\} \]


is a ring. (It is isomorphic to the localization of $\mathbb{Q}[x_i : i \in \omega]$ by $S$.) Of course, $R$
need not be $A$-computable. But it is $A$-c.e. Therefore, we may fix an $A$-computable
bijection $f \colon \omega \to R$. (In the argument over $\mathsf{RCA}_0$, the existence of $f$ follows by
Theorem 6.1.6.)
We now pull back the structure in $R$ via $f$ to get an $A$-computable ring $R^*$
with domain $\omega$, as follows: $0_{R^*} = f^{-1}(0_R)$; $1_{R^*} = f^{-1}(1_R)$; and for all $x, y \in \omega$,
$x +_{R^*} y = f^{-1}(f(x) +_R f(y))$ and $x \cdot_{R^*} y = f^{-1}(f(x) \cdot_R f(y))$. Since $+_R$ and $\cdot_R$ are
computable operations (they are the same operations as in $K$), it follows that $R^*$ is
computable from $f$ and hence from $A$, as wanted. Moreover, $R^*$ is isomorphic to $R$
via $f$.

Suppose $I$ is any maximal ideal of $R^*$. We claim that $i \in A'$ if and only if
$f^{-1}(x_i) \notin I$, which implies that $A' \leq_T A \oplus I$, completing the proof. We prove the
equivalent fact that $i \in A'$ if and only if $x_i \notin f(I)$, which is notationally lighter.
Notice that as $f$ is an isomorphism, $f(I)$ is a maximal ideal of $R$.

First, suppose $i \in A'$. Then $x_i \in S$ and hence $1/x_i \in R$. If $x_i$ belonged to $f(I)$
then as $f(I)$ is an ideal, so would $1/x_i \cdot_R x_i = 1_R$, which is impossible. Conversely,
suppose $x_i \notin f(I)$. By maximality of $f(I)$, there must exist $p/q \in R$ and $r/s \in f(I)$
such that


\[ \frac{p}{q} x_i + \frac{r}{s} = 1_R, \]


or equivalently, $psx_i = qs - qr$. Since $q$ and $s$ belong to $S$, so does their product;
thus, $qs$ contains a monomial $m$ in which only $x_j$ with $j \in A'$ appear. By contrast,
$r \notin S$, else $r/s \in f(I)$ would be invertible in $R$. So every monomial in $r$ has a factor
of the form $x_j$ for some $j \notin A'$, and the same must consequently be true of $qr$. It
follows that $m$ cannot cancel with any term in $qr$, and so must also be a monomial
of $psx_i$. But this means that $x_i$ is a factor of $m$, and therefore $i \in A'$, as was to be
shown. □


Further results concerning the complexity of ideals have been obtained by
Downey, Lempp, and Mileti [77]. For example, consider the fact that a ring has
no nonzero proper ideal if and only if it is a field. The “if” direction of this is easily
proved in $\mathsf{RCA}_0$, and the “only if” direction is easily proved in $\mathsf{ACA}_0$. (As pointed
out in [77], if $R$ is not a field then it has an element $r$ that is not a unit, and then $\langle r \rangle$
is a nontrivial ideal. Since $\langle r \rangle$ is arithmetically definable, $\mathsf{ACA}_0$ suffices to prove that
it exists.) We omit the proof of the following result showing that, with more effort,
$\mathsf{WKL}_0$ suffices.


Theorem 11.3.3 (Downey, Lempp, and Mileti [77]). The following are equivalent
under $\leq_c$ and over $\mathsf{RCA}_0$.


1. WKL.
2. A ring $R$ has no nonzero proper ideal if and only if it is a field.


11.4 Orderability 


In this section, we turn to theorems concerning the existence of orderings of groups, 
rings, and fields. Here, an ordering is a linear ordering of the domain that is com- 
patible with the algebraic structure. The definitions are as follows. 
Definition 11.4.1.


1. A group $(G, *_G, e_G)$ is orderable if there exists a linear ordering $\leq_G$ of $G$ such
that for all $a, b \in G$, if $a \leq_G b$ then $a *_G c \leq_G b *_G c$ and $c *_G a \leq_G c *_G b$
for all $c \in G$.

2. A ring $(R, +_R, \cdot_R, 0_R, 1_R)$ is orderable if there exists a linear ordering $\leq_R$ of $R$
such that for all $r, s \in R$:


• if $r \leq_R s$ then $r +_R t \leq_R s +_R t$ for all $t \in R$,
• if $r \leq_R s$ then $r \cdot_R t \leq_R s \cdot_R t$ and $t \cdot_R r \leq_R t \cdot_R s$ for all $t \in R$ with $t \geq_R 0_R$.


3. A field is orderable if it is orderable as a ring.


A number of classical results in group theory and field theory from the early part of 
the 20th century (referred to by Metakides and Nerode [209] as the “Steinitz—Artin 
period” in algebra) concern conditions under which different algebraic structures 
admit orderings. 

One prominent such result, due to Artin and Schreier [3], is that every formally
real field is orderable. Recall that a field $(K, +_K, \cdot_K, 0_K, 1_K)$ is formally real if $-1_K$
is not a sum of squares in $K$, i.e., there are no $k_0, \ldots, k_{n-1} \in K$ such that
$k_0 \cdot_K k_0 +_K \cdots +_K k_{n-1} \cdot_K k_{n-1} +_K 1_K = 0_K$. (Note that if $K$ is orderable then it is
necessarily formally real, so the Artin–Schreier result is actually a characterization.
But the converse direction is trivial.) Ershov [101], and independently Metakides
and Nerode [209], proved the existence of a computable formally real field having
no computable ordering. Hence, as a $\forall\exists$ theorem, the Artin–Schreier result is not
computably true. Metakides and Nerode [209] actually showed more. Namely, they
proved that the class of orderings of a field forms a $\Pi^0_1$ class in $2^\omega$, and conversely,
every nonempty $\Pi^0_1$ class is computably homeomorphic to the class of orderings
of some computable field. Building on this, Friedman, Simpson, and Smith [117]
obtained the following equivalence.


Theorem 11.4.2 (Friedman, Simpson, and Smith [117]). The following are equivalent
under $\leq_c$ and over $\mathsf{RCA}_0$.


1. WKL.
2. Every formally real field is orderable.


We refer to Simpson [288, Chapter IV.4] for a proof, as well as for much more content 
about the reverse mathematics of formally real fields. 

Next we shift to groups. Recall that an abelian group $(G, +_G, 0_G)$ is torsion free
if for every nonzero $a \in G$ and every natural number $n > 0$, $na \neq 0_G$. (Here, $na$ is
defined inductively: $0a = 0_G$, and for $n > 0$, $na = (n-1)a +_G a$.) A classical result of
Levi [197] states that an abelian group is orderable if and only if it is torsion free. The
effective content of this result was first considered in computable structure theory,
by Downey and Kurtz [73]. They gave an explicit construction of a computable
torsion free abelian group with no computable ordering. Thus, Levi’s theorem is
not computably true. In turn, Downey and Kurtz asked whether an analogue of the
Metakides and Nerode result above holds also in this setting. A close connection
between $\Pi^0_1$ classes and orderings of torsion free abelian groups was provided by
Hatziriakou and Simpson [140], as follows.
Theorem 11.4.3 (Hatziriakou and Simpson [140]). The following are equivalent
under $\leq_c$ and over $\mathsf{RCA}_0$.


1. WKL.
2. Every torsion free abelian group is orderable.
3. An abelian group is torsion free if and only if it is orderable.


Proof. That every orderable abelian group is torsion free can be proved by induction
in $\mathsf{RCA}_0$ (see Exercise 11.7.8). Thus, it suffices to prove the equivalence of (1)
with (2).

To prove this, we will make use of the following auxiliary notion. Suppose
$(G, +_G, 0_G)$ is an abelian group. A positive cone for $G$ is a set $P \subseteq G$ such that:


• $P$ is closed under $+_G$.
• For all $a \in G$, either $a$ or $-a$ belongs to $P$.
• For all $a \in G$, $a$ and $-a$ belong to $P$ if and only if $a = 0_G$.


Then $G$ is orderable if and only if it has a positive cone, and this fact is provable
in $\mathsf{RCA}_0$. (See Exercise 11.7.4.) In fact, if $\leq_G$ is an ordering of $G$ then there is a
positive cone $P$ for $G$ computable in $G \oplus \leq_G$; conversely, given a positive cone $P$ for
$G$, there is an ordering $\leq_G$ of $G$ computable in $G \oplus P$.
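The passage from a positive cone to an ordering is the rule "$a \leq_G b$ if and only if $b - a \in P$". The following is a minimal sketch of that rule, not taken from the source, assuming the group $\mathbb{Z}^2$ with the lexicographic positive cone; the names `in_cone` and `leq` are hypothetical, and the checks run only over a finite sample of group elements.

```python
from itertools import product


def add(u, v):
    return (u[0] + v[0], u[1] + v[1])


def neg(u):
    return (-u[0], -u[1])


def in_cone(u):
    """Positive cone of the lexicographic order on Z^2 (contains the identity)."""
    return u[0] > 0 or (u[0] == 0 and u[1] >= 0)


def leq(u, v):
    """Ordering recovered from the cone: u <= v iff v - u lies in P."""
    return in_cone(add(v, neg(u)))


sample = list(product(range(-2, 3), repeat=2))

# The three positive cone conditions, checked on the sample.
cone_ok = all(
    (in_cone(a) or in_cone(neg(a)))
    and ((in_cone(a) and in_cone(neg(a))) == (a == (0, 0)))
    for a in sample
) and all(
    in_cone(add(a, b)) for a in sample for b in sample if in_cone(a) and in_cone(b)
)

# Compatibility of the recovered ordering with the group operation.
order_ok = all(
    leq(add(a, c), add(b, c))
    for a in sample for b in sample for c in sample
    if leq(a, b)
)

print(cone_ok, order_ok)  # True True
```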

We now proceed to the argument. We prove the equivalence over $\mathsf{RCA}_0$; as usual,
the $\leq_c$ reduction is similar.


(1) → (2): We argue in $\mathsf{WKL}_0$. Let $(G, +_G, 0_G)$ be a torsion free abelian group, say with domain
$\{a_0 < a_1 < \cdots\}$ where $a_0 = 0_G$. Let $T \subseteq 2^{<\mathbb{N}}$ be the tree of all $\sigma$ with the following
properties:


• If $|\sigma| > 0$ then $\sigma(0) = 0$.

• For all nonempty $F < |\sigma|$ and all $z < |\sigma|$ with $a_z = \sum_{x \in F} a_x$, if $\sigma(x) = 1$ for
all $x \in F$ then $\sigma(z) = 1$.

• For all $x, y < |\sigma|$ with $a_x +_G a_y = 0_G$, $\sigma(x) = 1 - \sigma(y)$.


We claim that $T$ is infinite, and hence is an instance of WKL. In fact, we proceed by
induction to show the following stronger fact: for every $s$, there is a $\sigma \in T$ of length
$s$ such that $0_G \notin \langle \{a_x : \sigma(x) = 1\} \rangle$. This is trivial for $s = 0$. So fix $s > 0$ and assume
the claim holds for $s - 1$, as witnessed by $\sigma \in T$. If $\sigma 0 \in T$ then we are done because
for all $x$, $\sigma(x) = 1$ if and only if $\sigma 0(x) = 1$. Otherwise, from the definition there are
two possibilities:


• Case 1: There is a nonempty $F < s - 1$ such that $\sum_{x \in F} a_x = a_{s-1}$ and $\sigma(x) = 1$
for all $x \in F$.
• Case 2: There is an $x^* < s - 1$ such that $a_{x^*} +_G a_{s-1} = 0_G$ and $\sigma(x^*) = 0$.


In the first case we have $a_{s-1} \in \langle \{a_x : \sigma(x) = 1\} \rangle$, so in particular $0_G \notin \langle \{a_x :
\sigma 1(x) = 1\} \rangle$. Thus, $a_{s-1}$ cannot equal $0_G$ or $-a_w$ for any $w < s - 1$ such that $\sigma(w) = 1$.
It follows that $\sigma 1 \in T$. In the second case, we have only to check that $-a_{s-1}$ is not
equal to $\sum_{x \in F} a_x$ for some nonempty $F < s - 1$ with $\sigma(x) = 1$ for all $x \in F$. But if
this were the case then we would have $\sigma(x^*) = 1$, since $\sigma \in T$. Hence, again $\sigma 1 \in T$,
and the claim is proved.
The induction in the above argument is on a $\Pi^0_1$ formula of $s$, and hence its
verification goes through in $\mathsf{RCA}_0$.

Let $f \in [T]$ be arbitrary and let $P = \{0_G\} \cup \{a_x : f(x) = 1\}$. Then by definition
of $T$, $P$ is a positive cone for $G$.


(2) → (1): We now argue in $\mathsf{RCA}_0$. Fix computable injections $f, g \colon \mathbb{N} \to \mathbb{N}$ with
disjoint ranges. We construct a separating set for the ranges of $f$ and $g$, thereby
obtaining WKL. Let $\{p_x : x \in \mathbb{N}\}$ be an enumeration of the primes in increasing
order. Let $G$ be the abelian group generated by elements $y, x_0, x_1, \ldots$, subject to the
following relations for all $j$:


\[ p_{2j} x_{f(j)} -_G y = 0_G, \]
\[ p_{2j+1} x_{g(j)} +_G y = 0_G. \]


A typical element of $G$ can be put in the form $cy + \sum_{i \in F} d_i x_i$, where $c \in \mathbb{N}$, the $d_i$
are positive integers, $F$ is a (possibly empty) finite subset of $\mathbb{N}$, and for all $i \in F$ we
have

\[ (\forall j)[p_{2j} \leq d_i \to f(j) \neq i \wedge p_{2j+1} \leq d_i \to g(j) \neq i]. \tag{11.1} \]


Since $n + 1 < p_n$ for all $n$ (easily verified in $\mathsf{RCA}_0$), (11.1) is equivalent to
\[ (\forall j < d_i)[p_{2j} \leq d_i \to f(j) \neq i \wedge p_{2j+1} \leq d_i \to g(j) \neq i]. \]
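The bounding step rests on the inequality $n + 1 < p_n$. The following is a small numerical sanity check, offered as a sketch and not as a proof, assuming the primes are indexed from $p_0 = 2$; the helper name `first_primes` is hypothetical.

```python
def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


primes = first_primes(200)  # p_0 = 2, p_1 = 3, ...

# n + 1 < p_n for every index checked.
inequality_ok = all(n + 1 < p for n, p in enumerate(primes))

# Consequence used for (11.1): if p_{2j} <= d or p_{2j+1} <= d then j < d,
# so the quantifier over j can be bounded by d.
bounded_ok = all(
    j < d
    for d in range(1, 300)
    for j in range(100)
    if primes[2 * j] <= d or primes[2 * j + 1] <= d
)

print(inequality_ok, bounded_ok)  # True True
```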


Thus, by considering representatives in the above form and defining addition appro- 
priately, we can regard G as a group in the sense of Definition 11.1.1. Clearly, G is 
abelian. 

We claim that $G$ is torsion free. To this end, fix $a = cy + \sum_{i \in F} d_i x_i \in G$ and
suppose $na = 0_G$ for some $n > 0$. For the $y$ term to cancel with the $x_i$ terms, for
every $i \in F$ there must be a $j_i$ such that either $i = f(j_i)$ and $p_{2j_i} \mid nd_i$ or $i = g(j_i)$
and $p_{2j_i+1} \mid nd_i$. For each $i \in F$, let $q_i$ be $p_{2j_i}$ if $i = f(j_i)$ and $-p_{2j_i+1}$ if $i = g(j_i)$.
Thus, $q_i x_i = y$ for all $i \in F$. Also, by (11.1), we have $d_i < |q_i|$, hence $q_i \nmid d_i$; since
$|q_i|$ is prime, this means $q_i \mid n$. Putting these facts together, we obtain


\[ n d_i x_i = n d_i q_i^{-1} (q_i x_i) = n d_i q_i^{-1} y. \]


That $q_i \mid n$ is used to conclude that the rightmost term above is defined. The above
yields that
\[ 0_G = na = ncy + \sum_{i \in F} n d_i x_i = \Big( nc + \sum_{i \in F} n d_i q_i^{-1} \Big) y, \]

and so $c + \sum_{i \in F} d_i q_i^{-1}$ must be equal to 0. Let $r = \prod_{i \in F} q_i$. Then for each $i \in F$ we
have

\[ cr + \sum_{j \in F \setminus \{i\}} d_j q_j^{-1} r = -d_i q_i^{-1} r. \tag{11.2} \]

Since $f$ and $g$ are injective, the map $i \mapsto |q_i|$ is injective on $F$. It follows that
$q_i \mid q_j^{-1} r$ for all $j \neq i$ and therefore that $q_i$ divides the left hand side of (11.2). But
by the same token, $q_i \nmid q_i^{-1} r$. Hence, $q_i$ must divide $d_i$, which we already noted
above is not the case. This proves the claim.

Apply (2) to obtain a positive cone $P$ for $G$. Note that if $x_i \in P$ then by induction,
$nx_i \in P$ for all $n$. Conversely, if $nx_i \in P$ for some $n > 0$ then $x_i \in P$. This is because
if $x_i \notin P$ then $-x_i \in P$ since $P$ is a positive cone, and so also $(n - 1)(-x_i)$ and
$nx_i +_G (n - 1)(-x_i)$ belong to $P$. But the latter is equal to $x_i$ (again, by induction). So
in particular, for all $j$ we have $x_{f(j)} \in P$ if and only if $p_{2j} x_{f(j)} \in P$, and $x_{g(j)} \in P$
if and only if $p_{2j+1} x_{g(j)} \in P$. Also, for all $j$, $p_{2j} x_{f(j)} = y$ and $p_{2j+1} x_{g(j)} = -y$, and
exactly one of $y$ and $-y$ belongs to $P$. Consequently, either $x_{f(j)} \in P$ and $x_{g(j)} \notin P$
for all $j$, or $x_{g(j)} \in P$ and $x_{f(j)} \notin P$ for all $j$. Thus, if we let $Z = \{i : x_i \in P\}$,
then we have either that $\operatorname{range}(f) \subseteq Z$ and $Z \cap \operatorname{range}(g) = \emptyset$, or $\operatorname{range}(g) \subseteq Z$ and
$Z \cap \operatorname{range}(f) = \emptyset$, meaning that $Z$ is the desired separating set. □


Further results about orderability of algebraic structures have been obtained by
Solomon [296, 297] (see also Solomon [298]). For a group $G$, let $Z(G)$ denote the
center of $G$, i.e., $\{a \in G : (\forall x \in G)[a *_G x = x *_G a]\}$. For each normal subgroup $N \trianglelefteq G$, let
$\pi_N \colon G \to G/N$ be the natural homomorphism sending elements to cosets. Then
in $\mathsf{RCA}_0$, we say a group $G$ is nilpotent if there exists an $n > 0$ and a sequence
$\langle N_i : i < n \rangle$ of normal subgroups of $G$ such that


• $N_0 = \{e_G\}$,
• for all $i < n - 1$, $N_{i+1} = \pi_{N_i}^{-1}(Z(G/N_i))$,
• $N_{n-1} = G$.


Using this definition, the following result can be obtained. 


Theorem 11.4.4 (Solomon [297]). The following are equivalent under $\leq_c$ and over
$\mathsf{RCA}_0$.


1. WKL.
2. Every torsion free nilpotent group is orderable.


In general, $\mathsf{RCA}_0$ cannot prove for every group $G$ that $Z(G)$ exists (see
Exercise 11.7.9). But if $Z(G)$ does exist then $\mathsf{RCA}_0$ can verify its basic properties,
e.g., that $Z(G)$ is a normal subgroup of $G$ (see [297]).

We can find orderability results also at the level of $\mathsf{RCA}_0$ and $\mathsf{ACA}_0$. By replacing
“linear order” with “partial order” in Definition 11.4.1, we obtain the notion of a
partially orderable group. This, too, can be characterized in terms of positive cones,
only with the condition that a cone contains every group element or its inverse
removed. If $\leq_G$ is a partial ordering of $G$, a subgroup $N \leq G$ is convex (with respect
to this ordering) if $N$ contains every $x \in G$ such that $a \leq_G x \leq_G b$ for some $a, b \in N$.
If $N$ is convex, then $\leq_G$ induces an ordering of the quotient group $G/N$. Namely, let $P$
be the positive cone on $G$ associated to $\leq_G$; then $\{a \in G/N : (\exists b \in N)[a *_G b \in P]\}$
is a positive cone on $G/N$. It can be checked that the associated ordering is linear if
$\leq_G$ is.


Theorem 11.4.5 (Solomon [297]). The principle “if $G$ is a (linearly) orderable
group and $N$ is a convex normal subgroup of $G$, then the induced ordering on $G/N$
exists” admits computable solutions and is provable in $\mathsf{RCA}_0$.

On the other hand, for partial orderings in general, the existence of the induced
ordering on convex subgroups codes the jump.


Theorem 11.4.6 (Solomon [297]). The following are equivalent under $\leq_c$ and over
$\mathsf{RCA}_0$.


1. TJ.
2. If $G$ is a partially orderable group and $N$ is a convex normal subgroup of $G$,
then the induced ordering on $G/N$ exists.


Hence, (2) is equivalent over $\mathsf{RCA}_0$ to $\mathsf{ACA}_0$.


11.5 The Nielsen—Schreier theorem 


One theorem of algebra with especially interesting reverse mathematics behavior is
the Nielsen–Schreier theorem, which asserts that every subgroup of a free group is
free. It turns out that the strength of this theorem is affected by how, precisely, we
choose to think of subgroups. As we will see, this consideration yields two versions
of the Nielsen–Schreier theorem, one provable in $\mathsf{RCA}_0$ and the other equivalent over
$\mathsf{RCA}_0$ to $\mathsf{ACA}_0$. We will state both of these results carefully in this section, and give
a proof of the latter.


Definition 11.5.1. Fix a set $X \in 2^\omega$.


1. $\mathrm{Word}_X$ denotes the set $(X \times \{-1, 1\})^{<\omega}$; its elements are words (over $X$).

2. A word $w$ is reduced if for all $i < |w| - 1$, if $w(i)(0) = w(i + 1)(0)$ then
$w(i)(1) = w(i + 1)(1)$. The set of reduced words over $X$ is denoted $\mathrm{Red}_X$.

3. Two words $w_0$ and $w_1$ are 1-step equivalent (over $X$) if there is a $b < 2$ such that
$|w_b| = |w_{1-b}| + 2$, and there is an $i < |w_b| - 1$ as follows:


• $w_{1-b}(j) = w_b(j)$ for all $j < i$,
• $w_{1-b}(j) = w_b(j + 2)$ for all $j$ with $i \leq j < |w_{1-b}|$,
• $w_b(i)(0) = w_b(i + 1)(0)$ and $w_b(i)(1) = -w_b(i + 1)(1)$.


4. Two words $w_0$ and $w_1$ are freely equivalent (over $X$) if there is an $n \geq 2$ and a
sequence of words $v_0, \ldots, v_{n-1}$ such that $w_0 = v_0$, $w_1 = v_{n-1}$, and for all $i < n - 1$,
$v_i$ is equal or 1-step equivalent to $v_{i+1}$.


Note that we are using $w, v, u, \ldots$ for words, instead of Greek letters as we ordinarily
do for sequences; thus, e.g., $wv$ indicates the concatenation of $w$ by $v$, etc. Likewise,
following customary (and more legible) notation, we may write $a_0^{\varepsilon_0} \cdots a_{k-1}^{\varepsilon_{k-1}}$ for the
word $w$ with $|w| = k$ and $w(i) = \langle a_i, \varepsilon_i \rangle$ for all $i < k$. Thus, $w$ is reduced if and
only if it does not have the form $\cdots a^{1} a^{-1} \cdots$ or $\cdots a^{-1} a^{1} \cdots$ for some $a \in X$. The
act of “deleting” some such occurrence of $a^{1} a^{-1}$ or $a^{-1} a^{1}$ corresponds to a 1-step
equivalence, and doing this iteratively gives free equivalence.
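The iterated deletion just described can be carried out in a single left-to-right pass with a stack. The following is a minimal sketch, assuming a word is represented as a list of pairs `(generator, sign)` with sign in `{1, -1}`; the function names `free_reduce` and `is_reduced` are hypothetical.

```python
def free_reduce(word):
    """Delete adjacent pairs a^e a^(-e) until none remain."""
    out = []
    for gen, sign in word:
        if out and out[-1] == (gen, -sign):
            out.pop()  # one 1-step equivalence: cancel with the previous letter
        else:
            out.append((gen, sign))
    return out


def is_reduced(word):
    """Definition 11.5.1(2): equal adjacent generators carry equal signs."""
    return all(
        word[i][0] != word[i + 1][0] or word[i][1] == word[i + 1][1]
        for i in range(len(word) - 1)
    )


w = [("a", 1), ("b", 1), ("b", -1), ("a", -1), ("b", 1)]
print(free_reduce(w))  # [('b', 1)]
```

Here the word $a^1 b^1 b^{-1} a^{-1} b^1$ reduces to $b^1$ after two cancellations.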

The above definition readily formalizes in $\mathsf{RCA}_0$, and $\mathsf{RCA}_0$ suffices to prove
various basic facts concerning it, most notably that every word $w$ is freely equivalent
to a unique reduced word, which we denote by $\rho_X(w)$. (See Exercise 11.7.6.) This
facilitates the following definition, central to this section.


Definition 11.5.2. Fix $X \in 2^\omega$. The free group on $X$, denoted $F_X$, is the structure
$(\mathrm{Red}_X, \cdot_X, 1_X)$ in the language of groups, where $1_X$ is the empty string (as a sequence
in $\mathrm{Word}_X$) and for $w, v \in \mathrm{Word}_X$, $w \cdot_X v = \rho_X(wv)$ (with $wv = w^\frown v$ meaning the
concatenation of $w$ by $v$, as strings).


In $\mathsf{RCA}_0$, we can verify that $F_X$ is indeed a group (Exercise 11.7.7).
One remark, which will be important shortly, is that the map $\rho_X$ naturally extends
to a map $\rho_X \colon \mathrm{Word}_{\mathrm{Word}_X} \to \mathrm{Red}_X$.


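Definition 11.5.3.
1. For $w \in \mathrm{Word}_X$, let $w^{-1} \in \mathrm{Word}_X$ be the sequence of length $|w|$ such that for
all $i < |w|$,
\[ w^{-1}(i)(0) = w(|w| - i - 1)(0), \]
\[ w^{-1}(i)(1) = -w(|w| - i - 1)(1). \]


That is, if $w = a_0^{\varepsilon_0} \cdots a_{k-1}^{\varepsilon_{k-1}}$ then $w^{-1} = a_{k-1}^{-\varepsilon_{k-1}} \cdots a_0^{-\varepsilon_0}$.
2. For $w \in \mathrm{Word}_X$ and $\varepsilon \in \{-1, 1\}$, let
\[ \operatorname{sgn}(w, \varepsilon) = \begin{cases} w & \text{if } \varepsilon = 1, \\ w^{-1} & \text{if } \varepsilon = -1. \end{cases} \]


3. For $\langle w_0, \varepsilon_0 \rangle \cdots \langle w_{n-1}, \varepsilon_{n-1} \rangle \in \mathrm{Word}_{\mathrm{Word}_X}$, let
\[ \rho_X(\langle w_0, \varepsilon_0 \rangle \cdots \langle w_{n-1}, \varepsilon_{n-1} \rangle) = \rho_X(\operatorname{sgn}(w_0, \varepsilon_0) \cdots \operatorname{sgn}(w_{n-1}, \varepsilon_{n-1})). \]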
So for example, if we take X = {a, b} and apply px to the word 
(ab 'aa!)—(a' 1b })! 
over Wordx, we obtain 
px(ata'blatabp tb =a at. 
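The extension of $\rho_X$ to words over $\mathrm{Word}_X$ amounts to "invert the factors marked $-1$, concatenate, then freely reduce". The following is a minimal sketch of Definition 11.5.3, assuming letters are pairs `(generator, sign)`; the names `inverse`, `sgn` and `rho_extended`, and the sample word, are illustrative choices and are not the example printed in the text.

```python
def free_reduce(word):
    """Stack-based free reduction of a word over X."""
    out = []
    for gen, sign in word:
        if out and out[-1] == (gen, -sign):
            out.pop()
        else:
            out.append((gen, sign))
    return out


def inverse(word):
    """Reverse the word and flip every exponent (Definition 11.5.3(1))."""
    return [(gen, -sign) for gen, sign in reversed(word)]


def sgn(word, eps):
    """Definition 11.5.3(2): the word itself if eps = 1, its inverse if eps = -1."""
    return list(word) if eps == 1 else inverse(word)


def rho_extended(word_of_words):
    """Definition 11.5.3(3): apply sgn to each factor, concatenate, reduce."""
    flat = []
    for word, eps in word_of_words:
        flat.extend(sgn(word, eps))
    return free_reduce(flat)


u = [("a", 1), ("b", -1)]  # a b^-1
v = [("a", 1), ("b", 1)]   # a b
print(rho_extended([(u, -1), (v, 1)]))  # [('b', 1), ('b', 1)]
```

In this sample, $(a b^{-1})^{-1} (a b)^{1}$ becomes $b\, a^{-1} a\, b$, which reduces to $b\,b$.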
Using the notation and definitions above, we now give the definition of what it
means (in our setting) for a subgroup of $F_X$ to be free.


Definition 11.5.4. Fix $X \in 2^\omega$ and let $H$ be a subgroup of $F_X$. Then $H$ is free if
there is a set $B \subseteq H$ as follows:


• For every $h \in H$ there is a $w \in \mathrm{Red}_B \subseteq \mathrm{Word}_{\mathrm{Word}_X}$ such that $h = \rho_X(w)$.
• For all $w_0, w_1 \in \mathrm{Red}_B$, if $w_0 \neq w_1$ then $\rho_X(w_0) \neq \rho_X(w_1)$.


Thus, a subgroup $H$ of $F_X$ is free if it is generated by a subset of the elements of $H$
among which there are no nontrivial relations (in $H$). This, of course, is the standard
definition in algebra.
The Nielsen–Schreier theorem may at first glance seem trivial. But it is not,
precisely because finding the “right” basis $B$ in the above definition is nontrivial.
(Indeed, a subgroup $H$ of a free group $F_X$ may also be generated by a set of elements
that do satisfy nontrivial relations.) There is a quick topological proof of the theorem,
which uses the fact that a group is free if and only if it is the fundamental group of a
graph, and that every subgroup of such a group is the fundamental group of a covering
of this graph. This argument is difficult to formalize in second order arithmetic. But
there is another, using so-called Schreier transversals, which is more direct and more
constructive. (See Igusa [165] for both proofs in detail.) Building on this method,
Downey, Hirschfeldt, Lempp, and Solomon [76] obtained the following.


Theorem 11.5.5 (Downey, Hirschfeldt, Lempp, and Solomon [76]). $\mathsf{RCA}_0$ proves
that for every $X \in 2^\omega$, every subgroup of $F_X$ is free.


We do not include the proof here, which would take us somewhat far afield. In broad
outline, however, it follows the classical version.

There is another way to talk about a subgroup which is often more convenient in
algebra, namely, by using a presentation rather than an explicit subset. The next definition
shows how this can be accommodated in our setting.


Definition 11.5.6. Fix $X \in 2^\omega$.


1. For $S \subseteq F_X$, the subgroup presented by $S$ is the structure $\langle S \rangle$ with domain
\[ \{h \in F_X : (\exists w \in \mathrm{Word}_S)[\rho_X(w) = h]\}, \]


with multiplication inherited from $F_X$.
2. $\langle S \rangle$ is free if there is a set $B \subseteq F_X$ as follows:


• For every $w \in B$ there is a $v \in \mathrm{Word}_S$ such that $w = \rho_X(v)$.
• For every $h \in \langle S \rangle$ there is a $w \in \mathrm{Red}_B$ such that $h = \rho_X(w)$.
• For all $w_0, w_1 \in \mathrm{Red}_B$, if $w_0 \neq w_1$ then $\rho_X(w_0) \neq \rho_X(w_1)$.


In what follows, when we say that $\langle S \rangle$ is free we do not mean necessarily that $\langle S \rangle$
exists. Indeed, $\mathsf{RCA}_0$ cannot in general prove this existence, as we are about to see. But
if $\langle S \rangle$ does exist then $\mathsf{RCA}_0$ can prove it is a subgroup of $F_X$ (Exercise 11.7.10). And
it is not difficult to see that if $\langle S \rangle$ exists then it is free in the sense of Definition 11.5.4
if and only if it is free in the sense of Definition 11.5.6. Now, classically, moving
between $S$ and $\langle S \rangle$ is unproblematic and natural. But it has significance for the
strength of the Nielsen–Schreier theorem.


Theorem 11.5.7 (Downey, Hirschfeldt, Lempp, and Solomon [76]). The following
are equivalent under $\leq_c$ and over $\mathsf{RCA}_0$.


1. TJ.
2. For every $X \in 2^\omega$ and every $S \subseteq F_X$, $\langle S \rangle$ exists.
3. For every $X \in 2^\omega$ and every $S \subseteq F_X$, $\langle S \rangle$ is free.


Hence, (2) and (3) are equivalent over $\mathsf{RCA}_0$ to $\mathsf{ACA}_0$.
Proof. We give the equivalence over $\mathsf{RCA}_0$.


(1) → (2): Arguing in $\mathsf{ACA}_0$, fix $X$ and $S \subseteq F_X$. As $\langle S \rangle$ is $\Sigma^0_1$-definable in $S$, it exists
by arithmetical comprehension.


(2) → (3): Arguing in $\mathsf{RCA}_0$, fix $X$ and $S \subseteq F_X$. By (2), $\langle S \rangle$ exists and this is a
subgroup of $F_X$. By Theorem 11.5.5, $\langle S \rangle$ is free in the sense of Definition 11.5.4
and hence in the sense of Definition 11.5.6.


(3) → (1): We argue in $\mathsf{RCA}_0$, assuming (3). Fix an injective function $f \colon \mathbb{N} \to \mathbb{N}$;
we prove that $\operatorname{range}(f)$ exists. Let $X = \{a_n : n \in \mathbb{N}\}$ be a set of infinitely many
generators, and define


\[ S = \{a_n^{2} : n \in \mathbb{N}\} \cup \{a_n^{2x+1} : f(x) = n\}. \]


Here, for $z \in \mathbb{Z}$, $a_n^z$ is the obvious shorthand, defined inductively for $z > 1$ by
$a_n^z = a_n^{z-1} a_n^{1}$ and for $z < -1$ by $a_n^z = a_n^{z+1} a_n^{-1}$. Apply (3) to find a set $B \subseteq F_X$
witnessing that $\langle S \rangle$ is free. The proof is completed by the following two claims.
Claim 1: For all $n \in \mathbb{N}$, $n \in \operatorname{range}(f)$ if and only if $a_n \in \langle S \rangle$. For every $n$ we have
$a_n^{2} \in S$ and hence $a_n^{-2x} \in \langle S \rangle$ for every $x \in \mathbb{N}$. Now if $n \in \operatorname{range}(f)$ then $f(x) = n$
for some $x$, hence $a_n^{2x+1}$ belongs to $S$ and consequently $a_n = \rho_X(a_n^{2x+1} a_n^{-2x})$ belongs
to $\langle S \rangle$.

In the other direction, suppose $n \notin \operatorname{range}(f)$. We will show there is no $w \in \mathrm{Word}_S$
with $a_n = \rho_X(w)$, implying that $a_n \notin \langle S \rangle$, as needed.

For $w, v \in F_X$, say $v$ is an $a_n$ block in $w$ if there exists $i_0 < |w|$ as follows:


• $v(i)(0) = a_n$ for all $i < |v|$,
• $i_0 + |v| \leq |w|$ and $v(i) = w(i_0 + i)$ for all $i < |v|$,
• if $i_0 > 0$ then $w(i_0 - 1)(0) \neq a_n$,
• if $i_0 + |v| < |w|$ then $w(i_0 + |v|)(0) \neq a_n$.


Note that since $v \in F_X = \mathrm{Red}_X$, the first clause implies that $v(i)(1)$ is the same for
all $i < |v|$. Thus, $v$ is an $a_n$ block in $w$ just if it has the form $a_n^{1} \cdots a_n^{1}$ or $a_n^{-1} \cdots a_n^{-1}$,
and $w$ has the form $\cdots a_{m_0} v a_{m_1} \cdots$ for some $m_0, m_1 \neq n$.

We claim that for all $w \in \mathrm{Word}_S$ and all $v \in F_X$, if $v$ is an $a_n$ block in $\rho_X(w)$
then $|v|$ is even. In particular, this means $\rho_X(w) \neq a_n$, as is to be shown. The proof
is by induction on $|w|$. (Note that the statement to be proved is $\Pi^0_1$.) If $|w| = 1$ then
$w = (a_m^{2})^{\pm 1}$ for some $m$, so any $a_n$ block in $\rho_X(w) = a_m^{\pm 2}$ has length 0 or 2. Assume
then that $|w| > 1$, and that we have already proved the result for all words over $S$
of length $|w| - 1$. Then $w = w^*(a_m^{k})^{\pm 1}$ for some $w^* \in \mathrm{Word}_S$ with $|w^*| < |w|$ and
some $m, k \in \mathbb{N}$. By inductive hypothesis, any $a_n$ block in $\rho_X(w^*)$ has even length.
If $m \neq n$, then $\rho_X(w^*)$ and $\rho_X(w)$ have the same $a_n$ blocks, so the claim holds.
So suppose $m = n$. In this case, we necessarily have $k = 2$ since $n \notin \operatorname{range}(f)$
by assumption, and $\rho_X(w) = \rho_X(\rho_X(w^*) a_n^{\pm 2})$. If $\rho_X(w^*)$, as a sequence, ends in an $a_n$
block $v$, then $|v|$ is even so $\rho_X(v a_n^{\pm 2})$ is another $a_n$ block in $\rho_X(w)$ of even length.
Otherwise, $\rho_X(\rho_X(w^*) a_n^{\pm 2}) = \rho_X(w^*) a_n^{\pm 2}$, and the $a_n$ blocks in this word are those of
$\rho_X(w^*)$ as well as $a_n^{\pm 2}$, which has length two.
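Claim 1 can be checked by brute force on a small instance. The following is a sketch under stated assumptions, not part of the source proof: generators are the integers 0 and 1, the injection satisfies $f(1) = 1$ and $0 \notin \operatorname{range}(f)$, only that one odd power is included in $S$, and only words over $S$ of length at most 3 are enumerated. The helper names are hypothetical.

```python
from itertools import product


def free_reduce(word):
    """Stack-based free reduction; returns a tuple so results are hashable."""
    out = []
    for gen, sign in word:
        if out and out[-1] == (gen, -sign):
            out.pop()
        else:
            out.append((gen, sign))
    return tuple(out)


def power(gen, z):
    """The word gen^z for a nonzero integer z."""
    sign = 1 if z > 0 else -1
    return tuple((gen, sign) for _ in range(abs(z)))


def inverse(word):
    return tuple((gen, -sign) for gen, sign in reversed(word))


def block_lengths(word, gen):
    """Lengths of the maximal runs of `gen` in a reduced word."""
    lengths, run = [], 0
    for g, _ in word:
        if g == gen:
            run += 1
        else:
            if run:
                lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths


# Assumed instance: a_0^2, a_1^2 and a_1^(2*1+1) because f(1) = 1.
S = [power(0, 2), power(1, 2), power(1, 3)]
letters = S + [inverse(s) for s in S]

reached = set()
for length in range(1, 4):
    for choice in product(letters, repeat=length):
        flat = [letter for s in choice for letter in s]
        reached.add(free_reduce(flat))

blocks_even = all(n % 2 == 0 for w in reached for n in block_lengths(w, 0))
print(blocks_even, ((1, 1),) in reached, ((0, 1),) in reached)  # True True False
```

Every $a_0$ block found has even length, so $a_0$ itself is never produced, while $a_1 = \rho_X(a_1^{3} a_1^{-2})$ is.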
Claim 2: $\{n : a_n \in \langle S \rangle\}$, and hence $\operatorname{range}(f)$, exists. For every $n$, $a_n^{2} \in S \subseteq \langle S \rangle$,
hence by definition of $B$ there is a $v_n \in \mathrm{Red}_B$ with $\rho_X(v_n) = a_n^{2}$. Moreover, this $v_n$ is
unique, so the map $n \mapsto v_n$ exists by $\Delta^0_1$ comprehension. (See also Exercise 11.7.5.)

We claim that for each $n$, $a_n \in \langle S \rangle$ if and only if there is a $w \in \mathrm{Red}_B$ satisfying
the following:


• $\rho_X(w) = a_n$,
• $|w| \leq |v_n|$,
• $\{w(i)(0) : i < |w|\} \subseteq \{v_n(j)(0) : j < |v_n|\}$.


Clearly, once this is proved then it follows by $\Delta^0_1$ comprehension that $\{n : a_n \in \langle S \rangle\}$
exists. Note that the last clause simply says that every word occurring in $w$ as an
element of $\mathrm{Word}_S$ also occurs in $v_n$.

Fix $n$. By definition, $a_n \in \langle S \rangle$ if and only if there exists a $w \in \mathrm{Red}_B$ satisfying the
first clause above. Thus, to complete the proof it suffices to show that if $\rho_X(w) = a_n$
for some $w$ then $w$ necessarily satisfies the second and third clause above as well.

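Since $v_n \in \mathrm{Red}_B$, we have that $\rho_B(w^2) = v_n$. First, suppose $w$ has odd length as
an element of $\mathrm{Red}_B$, so that we can write it as $b_0^{\varepsilon_0} \cdots b_{2k}^{\varepsilon_{2k}}$ for some $k$. Then

\[ w^2 = (b_0^{\varepsilon_0} \cdots b_{k-1}^{\varepsilon_{k-1}} b_k^{\varepsilon_k} b_{k+1}^{\varepsilon_{k+1}} \cdots b_{2k}^{\varepsilon_{2k}})(b_0^{\varepsilon_0} \cdots b_{k-1}^{\varepsilon_{k-1}} b_k^{\varepsilon_k} b_{k+1}^{\varepsilon_{k+1}} \cdots b_{2k}^{\varepsilon_{2k}}), \]
which has length $2|w|$. Since $b_k^{\varepsilon_k}$ cannot cancel with $b_k^{\varepsilon_k}$ in the free reduction of
$w^2$ to $v_n$, the most that can happen is for $b_{k+1}^{\varepsilon_{k+1}} \cdots b_{2k}^{\varepsilon_{2k}}$ to cancel with $b_0^{\varepsilon_0} \cdots b_{k-1}^{\varepsilon_{k-1}}$,
leaving the reduced word


\[ b_0^{\varepsilon_0} \cdots b_{k-1}^{\varepsilon_{k-1}} b_k^{\varepsilon_k} b_k^{\varepsilon_k} b_{k+1}^{\varepsilon_{k+1}} \cdots b_{2k}^{\varepsilon_{2k}}. \]


Similarly, if $w$ has even length as an element of $\mathrm{Red}_B$ then we can write it as
$b_0^{\varepsilon_0} \cdots b_{2k-1}^{\varepsilon_{2k-1}}$ for some $k$, in which case the most that can happen in the reduction of


\[ w^2 = (b_0^{\varepsilon_0} \cdots b_{k-2}^{\varepsilon_{k-2}} b_{k-1}^{\varepsilon_{k-1}} b_k^{\varepsilon_k} b_{k+1}^{\varepsilon_{k+1}} \cdots b_{2k-1}^{\varepsilon_{2k-1}})(b_0^{\varepsilon_0} \cdots b_{k-2}^{\varepsilon_{k-2}} b_{k-1}^{\varepsilon_{k-1}} b_k^{\varepsilon_k} b_{k+1}^{\varepsilon_{k+1}} \cdots b_{2k-1}^{\varepsilon_{2k-1}}) \]


to $v_n$ is for $b_{k+1}^{\varepsilon_{k+1}} \cdots b_{2k-1}^{\varepsilon_{2k-1}}$ to cancel with $b_0^{\varepsilon_0} \cdots b_{k-2}^{\varepsilon_{k-2}}$. This is because $b_k^{\varepsilon_k}$ cannot
cancel with $b_{k-1}^{\varepsilon_{k-1}}$, as these also appear next to each other in $w$, which is already
reduced over $B$. In this case, we are thus left with the reduced word


\[ b_0^{\varepsilon_0} \cdots b_{k-1}^{\varepsilon_{k-1}} b_k^{\varepsilon_k} b_{k-1}^{\varepsilon_{k-1}} b_k^{\varepsilon_k} \cdots b_{2k-1}^{\varepsilon_{2k-1}}. \]


Either way, we have that $|v_n| = |\rho_B(w^2)| \geq |w|$, and every word occurring in $w$
occurs also in $v_n$. This completes the proof. □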
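The cancellation analysis for $w^2$ can be tested exhaustively on short words. The following is a sketch, assuming a two-letter basis written as the characters `b` and `c` and reduced words of length at most 6; it checks that the free reduction of $w^2$ is at least as long as $w$ and retains every letter of $w$. The names are hypothetical.

```python
from itertools import product


def free_reduce(word):
    """Stack-based free reduction; returns a tuple."""
    out = []
    for gen, sign in word:
        if out and out[-1] == (gen, -sign):
            out.pop()
        else:
            out.append((gen, sign))
    return tuple(out)


def reduced_words(generators, max_len):
    """Yield all nonempty reduced words up to the given length."""
    letters = [(g, s) for g in generators for s in (1, -1)]
    for length in range(1, max_len + 1):
        for w in product(letters, repeat=length):
            if free_reduce(w) == w:
                yield w


square_ok = True
for w in reduced_words("bc", 6):
    square = free_reduce(w + w)
    # |rho(w^2)| >= |w|, and every letter of w survives in rho(w^2).
    square_ok = square_ok and len(square) >= len(w) and set(w) <= set(square)

print(square_ok)  # True
```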


We conclude this section with a comment on representation. To algebraists, being
a free group is a property up to isomorphism. Thus, just as being a cyclic group
means to have a generator, and not necessarily to be $\mathbb{Z}/n\mathbb{Z}$ for some $n$, so too being
a free group means more than just being $F_X$ for some $X$. It is important to notice
that the above treatment accommodates this view, even if the terminology does not.
Namely, Definition 11.5.2 does not define a group to be free if it is isomorphic
to some $F_X$. But the proofs of Theorems 11.5.5 and 11.5.7 would go through just
the same for such groups. This is because in a model of $\mathsf{RCA}_0$, if a group $G$ is
asserted to be isomorphic to $F_X$ then the isomorphism must exist (in the model).
So for instance, if we have a subgroup $H$ of $G$ that we wish to show is free, we
first apply the isomorphism to $H$ to get a subgroup $\widehat{H}$ of $F_X$ (which exists, since the
isomorphism is surjective), and then apply the relevant theorem to conclude $\widehat{H}$ is
free, and hence that $H$ is free on account of being isomorphic to $\widehat{H}$.


11.6 Other topics 


Many other corners of algebra have been explored in reverse mathematics. In this 
section, we take a look at a smattering of these results. This survey is incomplete 
but illustrates several interesting theorems. We will omit the proofs, some of which 
are quite involved. Many of the reverse mathematics tools they use are implicit in 
the proofs we have seen already, for example, coding with polynomials over Q. In 
addition to these methods, a deep understanding of the underlying algebraic results 
is required. 

Algebra principles turn up at all levels of the “big five” hierarchy. We have seen
principles at the level of $\mathsf{WKL}_0$ and $\mathsf{ACA}_0$ above. An interesting principle at the level
of $\mathsf{RCA}_0$ is the following version of Hilbert’s Nullstellensatz.


Definition 11.6.1 (Sakamoto and Tanaka [266]). Fix $n, m \geq 1$. $\mathsf{NSS}_{n,m}$ is the
following statement: for all $p_0, \ldots, p_{m-1} \in \mathbb{C}[x_0, \ldots, x_{n-1}]$ having no common
zeroes there exist $q_0, \ldots, q_{m-1} \in \mathbb{C}[x_0, \ldots, x_{n-1}]$ such that $\sum_{i<m} p_i q_i = 1$.
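As a small illustration of the statement (an example chosen here, not one from the source), take $n = m = 2$. The polynomials $p_0$ and $p_1$ below have no common zero, since $p_0 = 0$ forces $x_0 = 0$ and then $p_1 = -1$, and the stated $q_0$, $q_1$ witness $\mathsf{NSS}_{2,2}$ for this instance:

```latex
\[
p_0 = x_0, \qquad p_1 = x_0 x_1 - 1, \qquad q_0 = x_1, \qquad q_1 = -1,
\]
\[
p_0 q_0 + p_1 q_1 = x_0 x_1 - (x_0 x_1 - 1) = 1 .
\]
```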


Theorem 11.6.2 (Sakamoto and Tanaka [266]). $\mathsf{RCA}_0$ proves $(\forall m)(\forall n)\,\mathsf{NSS}_{n,m}$.


The main difficulty here is that $\mathsf{NSS}_{n,m}$ speaks about real (and complex) polynomials,
and $\mathsf{RCA}_0$ only understands the reals as quickly converging Cauchy sequences (hence,
as sets). It can thus be quite complex to write down even basic statements about the
reals in the language $\mathcal{L}_2$.

What is needed, therefore, is a satisfaction predicate for sentences about the
reals that better lends itself to manipulating the reals in $\mathsf{RCA}_0$, and in particular,
suffices to prove that the real numbers satisfy the usual axioms of real closed fields.
The latter includes the axiom that every polynomial of odd degree has a root. For
polynomials of standard length, a suitable predicate was developed by Simpson
and Tanaka [290] and Tanaka and Yamazaki [307]. As pointed out by Sakamoto and
Tanaka [266], this is enough to prove $(\forall m)\,\mathsf{NSS}_{n,m}$ in $\mathsf{RCA}_0$ for every standard $n$. The
key innovation in [266] is the development of a stronger satisfaction predicate, and
an accompanying soundness theorem for $\mathsf{RCA}_0$, that pushes the argument through
also for $(\forall n)(\forall m)\,\mathsf{NSS}_{n,m}$.

Several results related to Ulm’s theorem characterizing countable abelian $p$-groups
lie at the level of $\mathsf{ATR}_0$. For a group $G$, let G denote $\bigoplus_{i \in \omega} G$, i.e., the direct
sum of countably many copies of $G$.


Theorem 11.6.3 (Friedman [111]). The following are equivalent over $\mathsf{RCA}_0$.


1. ATRo. 

2. For all p-groups Go and G, either Go is embeddable into en or G, is embed- 
dable into Go. 

3. For all p-groups G and H, there is a direct summand K of G and H such that 
every direct summand of G and H is embeddable into K. 

4. For all p-groups G and H, there is a direct summand J of G and H such that 
every direct summand of G and H is a direct summand of J. 

5. For every sequence $(G_0, G_1, \ldots)$ of $p$-groups, there exist $i < j$ such that $G_i$ is
embeddable in $G_j$.

6. For every sequence $(G_0, G_1, \ldots)$ of $p$-groups with $G_0 \supseteq G_1 \supseteq \cdots$, there exist
$i < j$ such that $G_i$ is embeddable in $G_j$.


In Chapter 12, we will see another equivalence to ATRo, similar in shape to (2): given 
two countable well orders, one necessarily embeds into the other. As observed by 
Greenberg and Montalban [129], both of these results can be obtained by iterating 
a certain kind of derivative operator along a countable well order. In fact, the au- 
thors present a general framework for results of this kind, including several further 
examples. 

Recall that an abelian group $G$ is divisible if for every $a \in G$ and every $n > 0$
there is a $b \in G$ such that $nb = a$, and that $G$ is reduced if its only divisible subgroup
is the trivial group, $\{0_G\}$. A well-known result of algebra states that every abelian
group is a direct sum of a divisible and a reduced group. This provides an example
of a theorem at the level of $\Pi^1_1$-$\mathsf{CA}_0$.


Theorem 11.6.4 (Friedman, Simpson, and Smith [117], after Feferman [103]).
The following are equivalent over $\mathsf{RCA}_0$.


1. $\Pi^1_1$-$\mathsf{CA}_0$.

2. For every abelian group $G$ there exists a divisible group $D$ and a reduced group
$R$ such that $G = D \oplus R$.


Our discussion thus far has exposed a conspicuous lack of examples of natural
theorems of algebra that would fall outside the “big five”. As discussed in Section 1.4,
experience has shown that the primary objects of study in algebra are of sufficiently
rich structure to facilitate coding, pushing most theorems not provable in $\mathsf{RCA}_0$
into one of the other four main subsystems. But this is an empirical observation
only, and Conidis [52] (see also Montalban [220]) has initiated a program to find
a counterexample: that is, a theorem known to algebraists (the miniaturization of)
which is neither provable in $\mathsf{RCA}_0$ nor equivalent to any of $\mathsf{WKL}_0$, $\mathsf{ACA}_0$, $\mathsf{ATR}_0$, or
$\Pi^1_1$-$\mathsf{CA}_0$.

An early contender for such a theorem was the classical result that every Artinian
ring is Noetherian. Recall that a ring $R$ is Artinian if it has no infinite strictly
descending sequence of ideals; $R$ is Noetherian if it has no infinite strictly ascending
sequence of left ideals. Conidis [52] showed that the principle that every Artinian
ring is Noetherian follows from $\mathsf{ACA}_0$ and implies $\mathsf{WKL}_0$ over $\mathsf{RCA}_0$, leaving open
the tantalizing possibility of it lying strictly in between. In subsequent work, this was
shown not to be the case.


Theorem 11.6.5 (Conidis [53]). The following are equivalent over RCAo. 


1. WKLo. 

2. Every Artinian ring is Noetherian. 

3. Every local Artinian ring is Noetherian. 

4. Every Artinian ring is a finite direct product of local Artinian rings. 
5. The Jacobson radical of an Artinian ring exists and is nilpotent. 


Here we recall that a ring is local if it has a unique maximal ideal, and the Jacobson
radical of a (not necessarily commutative) ring is the intersection of all its maximal
ideals.

The implication from (1) to (2) in the theorem is especially noteworthy. Indeed, 
the standard proof of (2) only goes through in $\mathsf{ACA}_0$, so Conidis gives, in particular,
a new proof of this well-known algebraic theorem. This is reminiscent of the proof 
in WKLo of the fact that every commutative ring has a prime ideal (Theorem 11.3.1), 
which likewise proceeded in a very different way from the standard one. Of course, 
we have also seen aspects of this in Chapter 8 with Ramsey’s theorem. The takeaway 
is that sometimes the system that can naturally formalize a classical proof coincides 
with the weakest system in which the theorem can be proved, but not always, and in 
those cases a separate, more refined argument is needed in the weaker system. In the 
case of (2) above, this necessitated the use of some fairly novel algebraic techniques. 

Remarkably, the intuition that the theory of Noetherian rings might yield an 
example of an algebraic result provably distinct from each of the “big five” turns out 
to be correct. In recent work, Conidis [54] has identified such a principle: 


Definition 11.6.6 (Conidis [54]). NFP is the following statement: every Noetherian 
ring has only finitely many prime ideals. 


This is in turn closely related to a purely combinatorial principle. 


Definition 11.6.7 (Conidis [54]). 


1. A tree $T \subseteq 2^{<\mathbb{N}}$ is completely branching if for all $\sigma \in T$, if $\sigma^\frown i \in T$ for some
$i < 2$ then also $\sigma^\frown (1-i) \in T$.

2. $\mathsf{TAC}$ is the following statement: every $\Sigma^0_1$-definable infinite completely branching
tree has an infinite antichain.


Conidis (see [54], Remark 2.4) gives a simple proof of $\mathsf{TAC}$ in $\mathsf{RCA}_0 + \mathsf{CAC}$. Fix a
$\Sigma^0_1$-definable infinite completely branching $T \subseteq 2^{<\mathbb{N}}$. By Theorem 6.1.6, there is an
injective function $f \colon \mathbb{N} \to 2^{<\mathbb{N}}$ with $\operatorname{range}(f) = T$. (As usual, we write this merely
as shorthand; we do not mean to assert that $T$ exists.) We can use $f$ to define a
partial order $\leq_P$ on $\mathbb{N}$, namely $x \leq_P y$ if and only if $f(x) \preceq f(y)$. By $\mathsf{CAC}$, $(\mathbb{N}, \leq_P)$
admits an infinite chain or antichain, $S$. If $S$ is an antichain, then we are done; so
suppose $S$ is a chain. Say $S = \{x_0 <_P x_1 <_P \cdots\}$, so that $f(x_0) \prec f(x_1) \prec \cdots$. For
each $i$, let $b_i = f(x_{i+1})(|f(x_i)|)$, so that $f(x_i) \prec f(x_i)^\frown b_i \preceq f(x_{i+1})$. But since $T$ is
completely branching, we must also have $f(x_i)^\frown (1 - b_i) \in \operatorname{range}(f)$. Now the set
$\{ f(x_i)^\frown (1 - b_i) : i \in \mathbb{N} \}$ exists and is an infinite antichain in $T$.
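
The flipping step in this argument can be tried out on a finite chain. The following Python sketch is only an illustration under simplifying assumptions: binary strings are modeled as Python strings over "0" and "1", the chain is given explicitly as a list, and the names `flip_siblings` and `is_antichain` are hypothetical helpers, not notation from the text.

```python
def flip_siblings(chain):
    """Given binary strings s_0, s_1, ... where each is a proper prefix of the
    next, return the strings s_i + (1 - b_i), where b_i is the bit of s_{i+1}
    at position len(s_i)."""
    flipped = []
    for s, t in zip(chain, chain[1:]):
        # The chain must be strictly increasing under the prefix relation.
        assert t.startswith(s) and len(t) > len(s)
        b = int(t[len(s)])
        flipped.append(s + str(1 - b))
    return flipped


def is_antichain(strings):
    """True if no string in the list is a prefix of a different one."""
    return all(
        not a.startswith(b) and not b.startswith(a)
        for i, a in enumerate(strings)
        for b in strings[i + 1:]
    )


# A finite chain standing in for f(x_0), f(x_1), ...
chain = ["", "0", "011", "0110", "011010"]
siblings = flip_siblings(chain)
```

Each flipped string branches off the chain at a different level, which is why no two of them are comparable.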
Conidis [54] also established the following far less straightforward results. 


Theorem 11.6.8 (Conidis [54]). 


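1. $\mathsf{RCA}_0 \vdash \mathsf{WKL} + \mathsf{CAC} \to \mathsf{NFP} \to \mathsf{TAC}$.
2. $\mathsf{RCA}_0 \vdash 2\text{-}\mathsf{RAN} \to \mathsf{TAC} \to \mathsf{OPT}$.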


Since $\mathsf{WKL}_0$ has an $\omega$-model consisting entirely of sets of hyperimmune free degree
(Exercise 4.8.13), we immediately obtain the following:


Corollary 11.6.9. $\mathsf{NFP} \nleq_\omega \mathsf{WKL}$.


On the other hand, by the low basis theorem and Corollary 8.5.9 there is also an
$\omega$-model of $\mathsf{WKL} + \mathsf{CAC}$ (indeed, of $\mathsf{WKL} + \mathsf{RT}^2_2$) consisting entirely of low$_2$ sets. In
particular, such a model does not contain $\emptyset'$.


Corollary 11.6.10. $\mathsf{TJ} \nleq_\omega \mathsf{NFP}$.


We conclude that $\mathsf{WKL}_0 \nvdash \mathsf{NFP}$ and $\mathsf{RCA}_0 \nvdash \mathsf{NFP} \to \mathsf{ACA}_0$. In particular, $\mathsf{NFP}$ is not
provable in $\mathsf{RCA}_0$, nor equivalent over $\mathsf{RCA}_0$ to any of the other four main subsystems
of $\mathsf{Z}_2$.

Conidis (personal communication) has conjectured that $\mathsf{NFP}$ implies $\mathsf{WKL}_0$ over
$\mathsf{RCA}_0$. As of this writing, this remains open.


Question 11.6.11 (Conidis). Does $\mathsf{RCA}_0 \vdash \mathsf{NFP} \to \mathsf{WKL}$?


A positive answer would give another example (along with the principle AP of the
previous chapter) of a natural theorem to lie strictly between $\mathsf{ACA}_0$ and $\mathsf{WKL}_0$.

The principle $\mathsf{TAC}$ is of independent interest, especially alongside the various
combinatorial principles discussed in Chapter 9. The bounds of Theorem 11.6.8 (2)
match those established for $\mathsf{RRT}^2_2$ in Theorems 9.4.8 and 9.4.15. However, we know
$\mathsf{RRT}^2_2$ implies $\mathsf{DNR}$ (Theorem 9.4.13), whereas $\mathsf{CAC}$ does not (Theorem 9.2.21).
Since $\mathsf{CAC}$ implies $\mathsf{TAC}$, as we saw above, it follows that $\mathsf{TAC}$ cannot imply $\mathsf{RRT}^2_2$.
However, the reverse implication remains open:


Question 11.6.12 (Conidis [54]). Does $\mathsf{RCA}_0 \vdash \mathsf{RRT}^2_2 \to \mathsf{TAC}$?


11.7 Exercises 


Exercise 11.7.1. Prove the following in $\mathsf{RCA}_0$. Let $V$ be a vector space over a field $K$
and let $B$ be a basis for $V$. Prove that for every vector $v \in V$ there is a unique finite set
$F \subseteq B$ such that for each $b \in F$ there is a scalar $k_b \neq 0_K$ satisfying $v = \sum_{b \in F} k_b b$.


Exercise 11.7.2. Prove the following in $\mathsf{RCA}_0$. If $G$ is an abelian group and $\alpha, \beta \in
G^{<\mathbb{N}}$ have the same range then $\prod_{a \in \alpha} a = \prod_{a \in \beta} a$.
Exercise 11.7.3 (Friedman, Simpson, and Smith [117]). Prove the following in
$\mathsf{RCA}_0$. Let $K$ be a field.


1. For all $p, q \in K[x]$ there exist $d, r \in K[x]$ such that $q = pd + r$ and $\deg(r) <
\deg(p)$.

2. For all $p, q \in K[x]$ with $p \neq q$ and $q \neq 0$ there exist $p^*, q^* \in K[x]$ such that
$pq^* = p^*q$ and for all $r, s, t \in K[x]$, if $rs = p^*$ and $rt = q^*$ then $r = 1$.


Exercise 11.7.4. Show that an abelian group is orderable if and only if it has a 
positive cone. 


Exercise 11.7.5. Working in $\mathsf{RCA}_0$, fix $X \in 2^{\mathbb{N}}$ and show that $\mathrm{Red}_X$ exists, and so
does the set of all pairs $(w, v) \in \mathrm{Word}_X \times \mathrm{Word}_X$ such that $w$ is freely equivalent
to $v$.


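Exercise 11.7.6 (Downey, Hirschfeldt, Lempp, and Solomon [76]). Working in
$\mathsf{RCA}_0$, fix $X \in 2^{\mathbb{N}}$ and define a function $\rho_X$ on $\mathrm{Word}_X$ recursively as follows:


* $\rho_X(1_X) = 1_X$,
* $\rho_X(a^{\epsilon}) = a^{\epsilon}$ for every $a \in A$ and $\epsilon \in \{-1, 1\}$,
* given $w \in \mathrm{Word}_X$ with $\rho_X(w) = a_0^{\epsilon_0} \cdots a_{k-1}^{\epsilon_{k-1}} \in \mathrm{Word}_X$, and given $a \in A$ and
$\epsilon \in \{-1, 1\}$,

$$\rho_X(w a^{\epsilon}) = \begin{cases} 1_X & \text{if } k = 1 \wedge a_{k-1} = a \wedge \epsilon_{k-1} = -\epsilon, \\ a_0^{\epsilon_0} \cdots a_{k-2}^{\epsilon_{k-2}} & \text{if } k > 1 \wedge a_{k-1} = a \wedge \epsilon_{k-1} = -\epsilon, \\ a_0^{\epsilon_0} \cdots a_{k-1}^{\epsilon_{k-1}} a^{\epsilon} & \text{otherwise.} \end{cases}$$


1. Prove that $\rho_X(w) \in \mathrm{Red}_X$ for all $w \in \mathrm{Word}_X$.

2. Prove that if $w \in \mathrm{Red}_X$ then $\rho_X(w) = w$.

3. Prove that $w$ is freely equivalent to $\rho_X(w)$ for all $w \in \mathrm{Word}_X$.

4. Prove that if $w, v \in \mathrm{Word}_X$ are freely equivalent then $\rho_X(w) = \rho_X(v)$.

5. Conclude that every word $w$ over $A$ is freely equivalent to a unique reduced
word, and this is $\rho_X(w)$.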
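
To get a feel for the recursion defining $\rho_X$ before attempting the exercise, it may help to run it on concrete words. The sketch below is an informal illustration, not a solution: it assumes a word is represented as a Python list of (generator, exponent) pairs with exponent in {-1, 1}, the empty list plays the role of $1_X$, and the function name `rho` is hypothetical.

```python
def rho(word):
    """Free reduction of a word given as a list of (generator, exponent)
    pairs, processed left to right as in the recursive definition."""
    reduced = []  # rho of the prefix read so far
    for a, e in word:
        if reduced and reduced[-1] == (a, -e):
            # The last letter of the reduced prefix cancels against a^e.
            reduced.pop()
        else:
            # Otherwise a^e is appended to the reduced prefix.
            reduced.append((a, e))
    return reduced


# a b b^{-1} a^{-1} reduces to the empty word.
w1 = [("a", 1), ("b", 1), ("b", -1), ("a", -1)]
# a a b^{-1} is already reduced.
w2 = [("a", 1), ("a", 1), ("b", -1)]
# b a a^{-1} b reduces to b b.
w3 = [("b", 1), ("a", 1), ("a", -1), ("b", 1)]
```

The stack-like behavior of `reduced` mirrors the three cases of the definition: the first two cases remove the final letter, and the third appends a new one.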


Exercise 11.7.7 (Downey, Hirschfeldt, Lempp, and Solomon [76]). Prove in
$\mathsf{RCA}_0$ that for every $X \subseteq \mathbb{N}$, the free group on $X$ is a group.


Exercise 11.7.8 (Hatziriakou and Simpson [140]). Prove in $\mathsf{RCA}_0$ that every
orderable abelian group is torsion free.


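Exercise 11.7.9 (Solomon [297]). Working in $\mathsf{RCA}_0$, fix an injective function
$f \colon \mathbb{N} \to \mathbb{N}$. Let $G$ be the free group generated by $\{x_i : i \in \mathbb{N}\} \cup \{y_i : i \in \mathbb{N}\}$ subject
to the following relations for all $i, j$:


* $x_i x_j = x_j x_i$,
* $y_i y_j = y_j y_i$,
* $y_i x_j = x_j y_i \leftrightarrow (\forall k < i)[f(k) \neq j]$.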
1. Prove by induction on the length of words that every element of $G$ is freely
equivalent to a unique word of the form

$$x_{i_0}^{\epsilon_0} \cdots x_{i_{n-1}}^{\epsilon_{n-1}} y_{j_0}^{\delta_0} \cdots y_{j_{m-1}}^{\delta_{m-1}} z,$$

where $i_0 < \cdots < i_{n-1}$, $j_0 < \cdots < j_{m-1}$, $\epsilon_i \neq 0$ and $\delta_j \neq 0$ for all $i < n$ and
$j < m$, and $z$ is a product of nontrivial commutators of elements of $G$. (Recall
that the commutator of $w_1, w_2 \in G$ is $[w_1, w_2] = w_1^{-1} w_2^{-1} w_1 w_2$.)

2. Conclude that $G$ exists as a group in the sense of Definition 11.1.1.

3. Show that $\operatorname{range}(f)$ is $\Delta^0_1$ definable from $Z(G)$.

4. Conclude that the statement that for every group $G$, the center $Z(G)$ exists, is
equivalent to $\mathsf{ACA}_0$.


Exercise 11.7.10. Working in $\mathsf{RCA}_0$, fix $X \in 2^{\mathbb{N}}$ and $S \subseteq F_X$. Verify that if $\langle S \rangle$
exists then it is a subgroup of $F_X$.


Exercise 11.7.11 (Conidis [54]). Let $T \subseteq 2^{<\omega}$ be an infinite completely branching
tree. A sequence $(T_s : s \in \omega)$ is an enumeration of $T$ if the following properties hold.


1. $T_0 = \{\langle\rangle\}$.

2. $T_s$ is a finite subset of $T_{s+1}$ for all $s$.

3. For each $s$ there is a unique $\sigma \in T_s$ such that $\sigma 0, \sigma 1 \in T_{s+1} \setminus T_s$.

4. $T = \bigcup_s T_s$.


Prove in $\mathsf{RCA}_0$ that there is an enumeration for every $\Sigma^0_1$-definable infinite completely
branching tree.


Chapter 12


Set theory and beyond


Simpson [288, p. 176] famously remarked that “$\mathsf{ATR}_0$ is the weakest set of axioms
which permits the development of a decent theory of countable ordinals”. We cannot
easily talk about ordinals (equivalence classes of well orderings) as such in $\mathsf{Z}_2$, but
many properties of the ordinals can be formulated in terms of specific well orderings
instead. We have already seen that $\mathsf{ATR}_0$ can express many such properties quite
naturally. In this chapter, we investigate additional properties of countable ordinals.
We will also look at several other topics from set theory, including basic results 
concerning Borel sets and determinacy. 

Our account here presupposes some familiarity with these topics and the basic 
background surrounding them. The reader who has not seen this material before may 
wish to consult a set theory text first, e.g., Jech [166] or Kunen [191]. 


12.1 Well orderings and ordinals 


It will be convenient to lay out some terminology and notation that we will use 
throughout this section and the next. 


Definition 12.1.1. Let $X$ and $Y$ be linear orderings.


1. A function $f \colon X \to Y$ is an embedding if it is injective and $x \leq_X y \to f(x) \leq_Y
f(y)$ for all $x, y \in X$.

2. An embedding $f \colon X \to Y$ is a strong embedding if $\operatorname{range}(f)$ is an initial segment
of $Y$ under $\leq_Y$.

3. $X$ embeds into $Y$, written $X \preceq Y$, if there is an embedding $f \colon X \to Y$. If
$\operatorname{range}(f) \subseteq Y^{[<_Y y]}$ for some $y \in Y$ then we write $X \prec Y$.

4. $X$ strongly embeds into $Y$, written $X \preceq_s Y$, if there is a strong embedding
$f \colon X \to Y$.


The assertion that $f$ is an embedding (or strong embedding) of $X$ into $Y$ is arithmetical
in $X$ and $Y$. $\mathsf{RCA}_0$ can easily verify that a composition of embeddings is an
embedding, that a composition of strong embeddings is a strong embedding, and
other basic properties.


Definition 12.1.2. If $S$ is a nonempty subset of $\mathbb{N}$ and $(X_n : n \in S)$ is a sequence of
linear orderings, then $\sum_{n \in S} X_n$ denotes the partial ordering $U$ with domain $\bigcup_{n \in S} X_n \times
\{n\}$ and ordering $\leq_U$ defined by $(a, n) \leq_U (b, m)$ if and only if $n < m$ or
$n = m$ and $a \leq_{X_n} b$.


If $X$ and $Y$ are linear orderings we write $X + Y$ for $\sum_{n \in \{0,1\}} X_n$, where $X_0 = X$ and
$X_1 = Y$. We also write $X + 1$ for the linear ordering $\sum_{n \in \{0,1\}} X_n$ where $X_0 = X$ and $X_1$ is the
linear ordering with domain $\{0\}$ and $\leq_{X_1} = \{(0, 0)\}$.
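
For finite orderings the sum construction can be carried out directly. The following Python sketch is a small illustration under stated assumptions: each ordering is represented by a pair (domain list, comparison function), and `ordered_sum` and `sort_by` are hypothetical helper names rather than anything defined in the text.

```python
from functools import cmp_to_key


def ordered_sum(components):
    """components maps each index n to a pair (domain, leq).
    Returns (domain, leq) for the sum: pairs (a, n), compared first by the
    index n and then, within the same index, by the ordering of X_n."""
    domain = [(a, n) for n in sorted(components) for a in components[n][0]]

    def leq(p, q):
        (a, n), (b, m) = p, q
        return n < m or (n == m and components[n][1](a, b))

    return domain, leq


def sort_by(domain, leq):
    """List the domain in increasing order under leq."""
    def cmp(p, q):
        if p == q:
            return 0
        return -1 if leq(p, q) else 1
    return sorted(domain, key=cmp_to_key(cmp))


# X is {0, 1, 2} under the usual order; ONE is the one-point ordering.
X = ([2, 0, 1], lambda a, b: a <= b)
ONE = ([0], lambda a, b: a == b)
domain, leq = ordered_sum({0: X, 1: ONE})
x_plus_one = sort_by(domain, leq)
```

In `x_plus_one` the single point of the second summand appears last, above every element of the copy of X.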

Clearly, $\sum_{n \in S} X_n$ is always a linear ordering. We also have the following,
which is a straightforward consequence of the definitions.


Lemma 12.1.3. The following is provable in $\mathsf{RCA}_0$.


1. If $S$ is a nonempty subset of $\mathbb{N}$, and $(X_n : n \in S)$ is a sequence of well orderings,
then $\sum_{n \in S} X_n$ is a well ordering.

2. If $X_0$, $X_1$, and $Z$ are well orderings with $X_1 \neq \emptyset$, and if $Y = X_0 + X_1$ and $Y \preceq Z$,
then $X_0 \prec Z$.


12.1.1 The $\Sigma^1_1$ separation principle


Our starting point is to prove Theorem 5.8.20, which states that the following statements
are equivalent over $\mathsf{RCA}_0$.


1. $\mathsf{ATR}_0$.
2. The $\Sigma^1_1$ separation principle (Definition 5.8.19).


Recall that, if $T \subseteq \omega^{<\omega}$ is an infinite tree, then the rank of an element $\alpha \in T$ is
defined to be

$$\operatorname{rk}_T(\alpha) = \sup\{\operatorname{rk}_T(\beta) + 1 : \beta \in T \wedge \beta \succ \alpha\},$$

and the rank of $T$ is then defined to be $\operatorname{rk}(T) = \operatorname{rk}_T(\langle\rangle)$. This definition is difficult to
formalize in $\mathsf{RCA}_0$ (or even $\mathsf{ACA}_0$). Since the rank of a well founded tree can be an
infinite ordinal, that definition seems to require at least $\mathsf{ATR}_0$ to state and work with.
So, instead, we make do with the following.


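Definition 12.1.4 (Rank of a tree). The following definition is made in $\mathsf{RCA}_0$. Fix
a well founded tree $T \subseteq \mathbb{N}^{<\mathbb{N}}$ and a well ordering $X$. We write $\operatorname{rk}(T) \leq X$ if there is
a function $f \colon T \to X$ such that $(\forall \alpha, \tau \in T)[\alpha \prec \tau \to f(\alpha) >_X f(\tau)]$.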


The following says that ATRo basically understands this to be the “correct” definition 
of rank. 


Lemma 12.1.5. The following is provable in $\mathsf{ATR}_0$. Fix a well founded tree $T \subseteq \mathbb{N}^{<\mathbb{N}}$
and a well ordering $X$ such that $\operatorname{rk}(T) \leq X$. Then there is a function $f \colon T \to X$ such
that for all $\alpha \in T$,

$$f(\alpha) = \sup\{S_X(f(\beta)) : \beta \in T \wedge \beta \succ \alpha\},$$

where $S_X(x)$ denotes the successor of $x \in X$ under $\leq_X$.


The proof is left to Exercise 12.5.1. Another important property of Definition 12.1.4 
is the following. 


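Lemma 12.1.6. The following is provable in $\mathsf{RCA}_0$. Fix a well founded tree $T \subseteq \mathbb{N}^{<\mathbb{N}}$.
Then $\operatorname{rk}(T) \leq \mathrm{KB}(T)$.


Proof. Recall from Proposition 5.8.8 that $\mathrm{WF}(T)$ implies $\mathrm{WO}(\mathrm{KB}(T))$. By the
definition of the Kleene-Brouwer ordering, the identity map $T \to \mathrm{KB}(T)$ witnesses
that $\operatorname{rk}(T) \leq \mathrm{KB}(T)$. □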
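
The reason the identity map works is that a proper extension of a string always lies strictly below it in the Kleene-Brouwer ordering. The Python sketch below checks this on a small finite tree. It assumes the usual convention (a string is KB-below another if it properly extends it, or if it is smaller at the first position where they differ); strings are modeled as tuples, and `kb_less` and `is_proper_prefix` are hypothetical names.

```python
from functools import cmp_to_key


def kb_less(s, t):
    """True if s is strictly below t in the Kleene-Brouwer ordering."""
    if s == t:
        return False
    for a, b in zip(s, t):
        if a != b:
            # First position of disagreement decides.
            return a < b
    # One string is a proper prefix of the other: the longer one is smaller.
    return len(s) > len(t)


def is_proper_prefix(s, t):
    return len(s) < len(t) and t[:len(s)] == s


# A small finite tree of tuples, closed under prefixes.
T = [(), (0,), (1,), (0, 0), (0, 1), (1, 0)]

kb_sorted = sorted(
    T,
    key=cmp_to_key(lambda s, t: -1 if kb_less(s, t) else (1 if kb_less(t, s) else 0)),
)

# The identity map is order reversing from extension to KB, as Definition 12.1.4 requires.
rank_witness_ok = all(
    kb_less(t, a) for a in T for t in T if is_proper_prefix(a, t)
)
```

Note that the root is the largest element of the finite ordering, and each node sits above all of its extensions.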


The next lemma we will need is the following, showing essentially that $\mathsf{ATR}_0$
suffices to prove the existence of initial segments of Kleene’s $\mathcal{O}$.


Lemma 12.1.7. The following is provable in $\mathsf{ATR}_0$. Fix $X$ such that $\mathrm{WO}(X)$. For
every set $S$, the set

$$\mathcal{O}^S_X = \{e \in \mathbb{N} : \Phi_e^S \text{ codes a linear ordering } Y_e \preceq X\}$$

exists.


Proof. In this proof, given a linear order $Y$ and $y \in Y$, write $Y {\upharpoonright} y$ in place of $Y^{[<_Y y]}$
for ease of notation. $\mathsf{ATR}_0$ can prove the existence of the set $Z$ of all $e \in \mathbb{N}$ such that
$\Phi_e^S$ codes a linear ordering $Y_e$ and

$$(\forall y \in Y_e)(\exists x \in X)[Y_e {\upharpoonright} y \preceq X {\upharpoonright} x].$$

(See Exercise 12.5.2.) We claim that $Z = \mathcal{O}^S_X$.

If $e \in \mathcal{O}^S_X$ we may fix an embedding $f \colon Y_e \to X$. Then for each $y \in Y_e$,
$Y_e {\upharpoonright} y \preceq X {\upharpoonright} f(y)$. Thus, $e \in Z$. (This is provable even in $\mathsf{RCA}_0$.)

Conversely, fix $e \in Z$. Then $Y_e$ must be a well ordering. For suppose we had
$y_0 >_{Y_e} y_1 >_{Y_e} \cdots$ for some $y_0, y_1, \ldots \in Y_e$. Let $f \colon Y_e {\upharpoonright} y_0 \to X$ be an embedding.
Then $f(y_1) >_X f(y_2) >_X \cdots$ would be an infinite descending sequence in $X$,
contradicting that $X$ is a well ordering. So, we may apply arithmetical transfinite
recursion along $Y_e$. In particular, for each $y \in Y_e$, we can define $f_y \colon Y_e {\upharpoonright} y \to X$ as
follows: if $y = 0_{Y_e}$ then $f_y = \emptyset$; if $y$ is a limit under $\leq_{Y_e}$ then $f_y = \bigcup_{z <_{Y_e} y} f_z$; if
$y$ is the successor under $\leq_{Y_e}$ of $w \in Y_e$ then $f_y = f_w \cup \{(w, m)\}$ for the $\leq_X$-least
$m \in X \setminus \operatorname{range} f_w$, or $\emptyset$ if no such $m$ exists.

We claim that $f_y \neq \emptyset$ for all $y >_{Y_e} 0_{Y_e}$. If not, then by arithmetical transfinite
induction, we may fix the $\leq_{Y_e}$-least such $y$. By construction, $y$ must be a successor
under $\leq_{Y_e}$, say of $w \in Y_e$. Our supposition thus implies that $\operatorname{range} f_w = X$, so $f_w$ is an
isomorphism. Fix any embedding $h \colon Y_e {\upharpoonright} y \to X$. Then $g = f_w^{-1} \circ h$ is an embedding
$Y_e {\upharpoonright} y \to Y_e {\upharpoonright} w$, which is impossible (see Exercise 12.5.3). This proves the claim.

By induction along $Y_e$, we find that each $f_y$ is a strong embedding. Letting
$f = \bigcup_{y \in Y_e} f_y$ we thus obtain an embedding of $Y_e$ into $X$. We conclude that $e \in \mathcal{O}^S_X$,
as was to be shown. □
The final ingredient we will need to prove Theorem 5.8.20 is the following
construction of a so-called double descent tree.


Definition 12.1.8. If $X$, $Y$ are linear orderings, then $X * Y$ is the set containing $\langle\rangle$ and
all finite strings of the form

$$\langle (x_0, y_0), \ldots, (x_{n-1}, y_{n-1}) \rangle \in (X \times Y)^{<\mathbb{N}},$$

where $n > 0$, $x_0 >_X \cdots >_X x_{n-1}$, and $y_0 >_Y \cdots >_Y y_{n-1}$.


The basic properties of this definition, listed below, are easy to verify. They are also
straightforward to formalize in $\mathsf{RCA}_0$.


Lemma 12.1.9. Let $X$ and $Y$ be linear orderings.


1. $X * Y$ is a tree.

2. If $\neg\mathrm{WF}(X * Y)$ then $\neg\mathrm{WO}(X)$ and $\neg\mathrm{WO}(Y)$.

3. If $\neg\mathrm{WO}(X)$ then $Y \preceq X * Y$.

4. If $\mathrm{WO}(X)$ and $\mathrm{WO}(Y)$ and $X \preceq Y$, then $X \preceq X * Y$.


Since $X * Y$ and $Y * X$ are clearly isomorphic, all occurrences of $X * Y$ above could
be replaced by $Y * X$. With reference to (4), note that $X$ need not embed into $X * Y$
if it does not embed into $Y$.

We are ready to assemble the pieces to prove the theorem. 


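Proof (of Theorem 5.8.20). (1) $\to$ (2): We argue in $\mathsf{ATR}_0$. Let $\varphi(x)$ and $\psi(x)$ be $\Sigma^1_1$
formulas such that $(\forall x)[\varphi(x) \to \neg\psi(x)]$. We must prove that there is a separating
set for $\varphi$ and $\psi$, meaning a set $Z$ such that $(\forall x)[\varphi(x) \to x \in Z]$ and $(\forall x)[\psi(x) \to
x \notin Z]$.

By Corollary 5.8.4, there exist sequences of trees $(T_n : n \in \mathbb{N})$ and $(S_n : n \in \mathbb{N})$
such that for all $n \in \mathbb{N}$,

$$\varphi(n) \leftrightarrow \neg\mathrm{WF}(T_n)$$

and

$$\psi(n) \leftrightarrow \neg\mathrm{WF}(S_n).$$

Thus, our assumption that $\varphi$ and $\psi$ are disjoint means that for all $n$,

$$\mathrm{WF}(T_n) \vee \mathrm{WF}(S_n). \tag{12.1}$$

By (12.1) and Lemma 12.1.9 (2), it follows that $\mathrm{KB}(T_n) * \mathrm{KB}(S_n)$ is well founded for
all $n$. And by Lemma 12.1.6 and Lemma 12.1.9 (3), if $\varphi(n)$ holds then necessarily
$\operatorname{rk}(S_n) \leq \mathrm{KB}(T_n) * \mathrm{KB}(S_n)$.

Let $U = \sum_{n \in \mathbb{N}} \mathrm{KB}(T_n) * \mathrm{KB}(S_n)$. By Lemma 12.1.3, $\mathrm{WO}(U)$. Clearly, $\mathrm{KB}(T_n) *
\mathrm{KB}(S_n) \preceq U$ for every $n$. Hence, by the discussion above, if $\varphi(n)$ holds then
$\operatorname{rk}(S_n) \leq U$. On the other hand, if $\psi(n)$ holds then $\neg\mathrm{WF}(S_n)$, so we cannot have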
rk(S,;,) < U (orrk(S,) < X, for any well ordering X). Thus, Z = {n: S, < Z} isa 
separating set for y and w. And Z exists by applying Lemma 12.1.7 to S = (S, :n € 
N) and noting that X = O%. 
(2) $\to$ (1): First, notice that $\mathsf{RCA}_0$ together with the $\Sigma^1_1$ separation principle implies
$\Sigma^0_1$ comprehension. Indeed, given any $\Sigma^0_1$ formula $\varphi(x)$ both it and $\psi(x) \equiv \neg\varphi(x)$
are $\Sigma^1_1$, and a separating set for $\varphi$ and $\psi$ is precisely $\{x \in \mathbb{N} : \varphi(x)\}$. Hence, in the
rest of the proof we may argue over $\mathsf{ACA}_0$.

Fix X and L and assume WO(L). We must show that (3Z)[Z = X)]. For a set 
Y andne L,letY!"] ={y EN: (n,y) € Y} and Yl[<e7l = Umern y']_ Notice that 
for all sets Y and V, if 


yl] = XA (Vn <, x)[n >, 0, GYM! = yl<enl’] 


and 
vu] = ¥ A (Vn <_ x)[n >z 0, 2 Vl = yl<el’], 


then Y!<e*] = yl<z+l, This is proved easily by arithmetical transfinite induction 
along L. (Recall that, by Exercise 5.13.19, ACAg proves arithmetical transfinite 
induction.) 

Define v(x, y) and W(x, y) to be the 2 formulas 


(ay)[Y] =X A (Vn <p x)[n >, 0, 9 YM! syle] nye yl<e*17] 
and 
(ay)[Yl =X A (Vn <p x)[n >7 0, 9 YM syle] ay eylse!'], 


respectively. Then for all x,y, we must have =y(x, y) V =W(x, y). For suppose 
v(x, y) and w(x, y) were both true, as witnessed by Y and V, respectively. Then by 
the observation above, we would have Y!<e*! = vl<z*] and hence also Y/<4*!’ = 
vi<-*]’, But then this set would both contain and not contain y, a contradiction. 
We may consequently apply the xi separation principle to obtain a set Z such 
that for all x, y, if g(x, y) then (x, y) € Z, and if w(x, y) then (x, y) ¢ Z. Another 
arithmetical transfinite induction along L now establishes that for all n € L, Z'"! = 
Zi<in)’, Hence, Z = xX), which is what we wanted. oO 


12.1.2 Comparability of well orderings 


We now turn to what is arguably one of the most fundamental properties of ordinals: 
any two ordinals are comparable, meaning one is an initial segment of the other. 
Using Definition 12.1.1, we can formalize this in two ways. 


Definition 12.1.10.


1. The weak comparability of well orderings ($\mathsf{WCWO}$) is the statement that for all
$X, Y$, if $\mathrm{WO}(X)$ and $\mathrm{WO}(Y)$, then either $X \preceq Y$ or $Y \preceq X$.

2. The strong comparability of well orderings ($\mathsf{SCWO}$) is the statement that for all
$X, Y$, if $\mathrm{WO}(X)$ and $\mathrm{WO}(Y)$, then either $X \preceq_s Y$ or $Y \preceq_s X$.
In this section, we will show that these principles are equivalent to each other and
to $\mathsf{ATR}_0$. We will need the following two preliminary results.


Proposition 12.1.11 (Friedman; see Hirst and Friedman [114]). $\mathsf{RCA}_0 + \mathsf{WCWO} \vdash
\mathsf{ACA}_0$.


Proof. We argue in $\mathsf{RCA}_0$. Fix an injective function $f \colon \mathbb{N} \to \mathbb{N}$. Let $X$ be the set of
all triples $(y, x, z)$ such that either $x = z = 0$, or $f(x) = y$ and $z \leq x$. Let $\leq_X$ be the
lexicographic ordering of $X$. Using $\mathrm{I}\Sigma^0_1$, we can verify that $\mathrm{WO}(X)$. Let $Y = \mathbb{N} + 1$.
For ease of notation, we may identify the largest element of $Y$ under $\leq_Y$ with $0$,
and identify $\{i \in Y : i <_Y 0\}$ under $\leq_Y$ with the positive integers under their usual
ordering. We clearly have $\mathrm{WO}(Y)$.

By $\mathsf{WCWO}$, either $X \preceq Y$ or $Y \preceq X$. We claim the latter is impossible. For
suppose otherwise, and fix an embedding $g \colon Y \to X$. For each $i \in \mathbb{N}$, write $g(i) =
(y_i, x_i, z_i)$. Since $g(1) <_X g(2) <_X \cdots <_X g(0)$ and $\leq_X$ is the lexicographic
order, we must have $y_1 \leq y_2 \leq \cdots \leq y_0$. By bounded $\Sigma^0_1$ comprehension, the set
$F = \{y_i : i > 0\}$ exists and so is finite. Choose an $i > 0$ so that $y_i = \max F$. Then
$y_j = y_i$ for all $j \geq i$. If $x_j = 0$ for all $j \geq i$ then we must also have $z_j = 0$ by
definition of $X$, which contradicts the fact that $g(i) <_X g(i+1) <_X \cdots$.

So, without loss of generality, suppose $x_i > 0$. Then $x_j \geq x_i > 0$ for all $j \geq i$,
hence $f(x_j) = y_j = y_i$ by definition of $X$, so actually $x_j = x_i$ since $f$ is injective.
This means that $z_j \leq x_j = x_i$ for all $j \geq i$. But then we cannot have $z_i < z_{i+1} < \cdots$
by Proposition 6.2.7, so again we obtain a contradiction to the fact that $g(i) <_X
g(i+1) <_X \cdots$.

We conclude that $X \preceq Y$, say via $h \colon X \to Y$. Now if $f(x) = y$ then $(y, x, z) <_X
(y + 1, 0, 0)$ for all $z \leq x$. Thus

$$h(\langle y, x, 0 \rangle) < \cdots < h(\langle y, x, x \rangle) < h(\langle y + 1, 0, 0 \rangle).$$

In particular, $x < h(\langle y + 1, 0, 0 \rangle)$. So we see that

$$y \in \operatorname{range}(f) \leftrightarrow (\exists x < h(\langle y + 1, 0, 0 \rangle))[f(x) = y],$$

and therefore $\operatorname{range}(f)$ exists. □


We leave the proof of the second lemma to the next subsection. Recall the $\Sigma^1_1$
axiom of choice introduced in Chapter 10, which is the scheme containing

$$(\forall n)(\exists X)\,\psi(n, X) \to (\exists Y)(\forall n)\,\psi(n, Y^{[n]})$$

for all $\Sigma^1_1$ formulas $\psi$.


Proposition 12.1.12 (Friedman; see Hirst and Friedman [114]). $\mathsf{RCA}_0 + \mathsf{WCWO}$
proves the $\Sigma^1_1$ axiom of choice.


The proof is combinatorially involved, but does not feature any elements we have
not already encountered. We will see that the $\Sigma^1_1$ axiom of choice is strictly weaker
than $\mathsf{ATR}_0$ in Corollary 12.1.14 and Proposition 12.1.15 at the end of this section.
We can now prove the comparability theorem. 
Theorem 12.1.13 (Friedman; see Hirst and Friedman [114]). The following
statements are equivalent over $\mathsf{RCA}_0$.


1. ATRo. 
2. SCWO. 
3. WCWO. 


Proof. (1) $\to$ (2): We argue in $\mathsf{ATR}_0$. Fix $X$ and $Y$ such that $\mathrm{WO}(X)$ and $\mathrm{WO}(Y)$.
The proof is similar to that of Lemma 12.1.7, only this time we do not know which
of $X$ or $Y$ should embed into the other. We write $X {\upharpoonright} x$ and $Y {\upharpoonright} y$ for initial segments
under $\leq_X$ and $\leq_Y$, as in that proof. By arithmetical transfinite recursion along $Y$,
define $f_y \colon Y {\upharpoonright} y \to X$ as follows: if $y = 0_Y$ then $f_y = \emptyset$; if $y$ is a limit under
$\leq_Y$ then $f_y = \bigcup_{z <_Y y} f_z$; and if $y$ is the successor under $\leq_Y$ of some $w \in Y$ then
$f_y = f_w \cup \{(w, m)\}$ for the $\leq_X$-least $m \in X \setminus \operatorname{range} f_w$, or $\emptyset$ if no such $m$ exists. We
consider two cases.


Case 1: $f_y \neq \emptyset$ for all $y >_Y 0_Y$. By induction along $Y$, each $f_y$ is a strong embedding.
Letting $f = \bigcup_{y \in Y} f_y$ yields a strong embedding of $Y$ into $X$.


Case 2: $f_y = \emptyset$ for some $y >_Y 0_Y$. By arithmetical transfinite induction, fix the
least such $y$. By construction, $y$ must be a successor under $\leq_Y$, say of $w \in Y$. By
hypothesis, $\operatorname{range} f_w = X$, so $f_w$ is an isomorphism. Then $f_w^{-1}$ is a strong embedding of $X$
into $Y$.
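
For finite well orderings the recursion in this argument reduces to matching elements greedily from the bottom up, and the two cases correspond to which ordering runs out first. The Python sketch below is only a finite illustration of that idea: orderings are given as lists in increasing order, embeddings are dictionaries, and `compare` is a hypothetical name.

```python
def compare(X, Y):
    """X, Y: lists of distinct elements in increasing order, viewed as finite
    well orderings. Walk up Y, sending each element to the least element of X
    not yet used. Returns a tag and a strong embedding as a dict."""
    f = {}
    unused = iter(X)
    for w in Y:
        m = next(unused, None)
        if m is None:
            # Case 2: X is exhausted, so f maps an initial segment of Y onto X
            # and its inverse strongly embeds X into Y.
            return "X into Y", {v: k for k, v in f.items()}
        f[w] = m
    # Case 1: every stage succeeded, so f strongly embeds Y into X.
    return "Y into X", f


case1 = compare(["a", "b", "c"], [10, 20])
case2 = compare(["a"], [10, 20, 30])
```

In the infinite setting the same matching must be carried out by arithmetical transfinite recursion, which is where the strength of $\mathsf{ATR}_0$ is used.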


(2) > (3): Clear. 


(3) $\to$ (1): Assume $\mathsf{WCWO}$. By Proposition 12.1.11, we may argue over $\mathsf{ACA}_0$. By
Theorem 5.8.20, it suffices to prove the $\Sigma^1_1$ separation principle. Let $\varphi(x)$ and $\psi(x)$
be $\Sigma^1_1$ formulas such that for all $x$, $\neg\varphi(x) \vee \neg\psi(x)$. By Corollary 5.8.4, there exist
sequences of trees $(T_n : n \in \mathbb{N})$ and $(S_n : n \in \mathbb{N})$ such that for all $n \in \mathbb{N}$,

$$\varphi(n) \leftrightarrow \neg\mathrm{WF}(T_n)$$

and

$$\psi(n) \leftrightarrow \neg\mathrm{WF}(S_n).$$

We claim that for all $n$, either $\mathrm{KB}(T_n)$ or $\mathrm{KB}(S_n)$ embeds into $\mathrm{KB}(T_n) * \mathrm{KB}(S_n)$.
If $\varphi(n)$ or $\psi(n)$ holds, then this follows by Lemma 12.1.9 (3) just as in the proof of
Theorem 5.8.20 above. If neither $\varphi(n)$ nor $\psi(n)$ holds, then $\mathrm{KB}(T_n)$ and $\mathrm{KB}(S_n)$ are
both well orderings. Thus one embeds into the other by $\mathsf{WCWO}$, and hence at least
one embeds into $\mathrm{KB}(T_n) * \mathrm{KB}(S_n)$ by Lemma 12.1.9 (4).

Let $\theta(n, f)$ be the formula asserting that $f$ is an embedding from $\mathrm{KB}(T_n)$ or
$\mathrm{KB}(S_n)$ into $\mathrm{KB}(T_n) * \mathrm{KB}(S_n)$. Note that $\theta$ is arithmetical, hence $\Sigma^1_1$. By the argument
just given we have $(\forall n)(\exists f)\,\theta(n, f)$. By Proposition 12.1.12 we may apply the $\Sigma^1_1$
axiom of choice. This yields a sequence $(f_n : n \in \mathbb{N})$ such that $\theta(n, f_n)$ holds for
every $n$. Let $Z$ be the set of all $n$ such that $f_n$ is an embedding from $\mathrm{KB}(S_n)$ into
$\mathrm{KB}(T_n) * \mathrm{KB}(S_n)$. Then $Z$ exists by arithmetical comprehension, and we claim it
is a separating set for $\varphi$ and $\psi$. Indeed, if $\varphi(n)$ holds then $\neg\mathrm{WF}(T_n)$, so there is
no embedding from $\mathrm{KB}(T_n)$ into (the well founded tree) $\mathrm{KB}(T_n) * \mathrm{KB}(S_n)$. Thus,
$n \in Z$. Similarly, if $\psi(n)$ holds then there is no embedding from $\mathrm{KB}(S_n)$ into
$\mathrm{KB}(T_n) * \mathrm{KB}(S_n)$, and hence $n \notin Z$. □


One consequence is the following, which follows immediately from Proposi- 
tion 12.1.12 and Theorem 12.1.13. 


Corollary 12.1.14. $\mathsf{ATR}_0$ proves the $\Sigma^1_1$ axiom of choice.


Of course, we have yet to prove Proposition 12.1.12, and we do this in the next 
section. For completeness, we point out that the above corollary cannot be improved 
to an equivalence. Recall the w-model HYP from Chapter 5, and the fact that ATRo 
fails in this model (Proposition 5.8.21). 


Proposition 12.1.15 (Kreisel [186]). The $\omega$-model $\mathrm{HYP}$ satisfies the $\Sigma^1_1$ axiom of
choice. Hence, the $\Sigma^1_1$ axiom of choice does not imply $\mathsf{ATR}_0$ over $\mathsf{RCA}_0$.


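Proof. Kreisel [186] showed that if $\varphi(x, y)$ is a $\Pi^1_1$ formula then

$$(\forall x)(\exists y)\,\varphi(x, y) \to (\exists h \in \mathrm{HYP})(\forall x)\,\varphi(x, h(x)). \tag{12.2}$$

See Sacks [264, Lemma II.2.6] for a proof. Another standard fact of hyperarithmetical
theory is that for every $b$ in Kleene’s $\mathcal{O}$, the predicates $x \in H_b$ and $x \notin H_b$ are both
uniformly $\Pi^1_1$ in $b$ (see [264, Lemma II.1.3]).

Now consider any $\Sigma^1_1$ formula $\varphi(n, X)$. By Kleene’s normal form theorem (Theorem 5.8.2),
we can write this as $(\exists f \in \mathbb{N}^{\mathbb{N}})(\forall k)\,\theta(X {\upharpoonright} k, f {\upharpoonright} k, n)$ for some arithmetical
formula $\theta$. Suppose

$$\mathrm{HYP} \models (\forall n)(\exists X)\,\varphi(n, X). \tag{12.3}$$

Then, for each $n$, there is a hyperarithmetical set $X$ and hyperarithmetical function
$f$ such that $(\forall k)\,\theta(X {\upharpoonright} k, f {\upharpoonright} k, n)$. Then (12.3) can be rewritten as

$$(\forall n)(\exists b, e_0, e_1)\,\psi(b, e_0, e_1, n),$$

where $\psi$ is the formula

$$b \in \mathcal{O} \wedge \Phi_{e_0}^{H_b} \in 2^{\omega} \wedge \Phi_{e_1}^{H_b} \in \omega^{\omega} \wedge (\forall k)\,\theta(\Phi_{e_0}^{H_b} {\upharpoonright} k, \Phi_{e_1}^{H_b} {\upharpoonright} k, n).$$

By the remarks above, $\psi$ is $\Pi^1_1$. So, by Kreisel’s theorem, we may fix $h$ satisfying
$(\forall n)[h(n) = (b, e_0, e_1) \wedge \psi(b, e_0, e_1, n)]$. Since $h \in \mathrm{HYP}$, also $(X_n : n \in \omega) \in \mathrm{HYP}$,
where $X_n = \Phi_{e_0}^{H_b}$ for the $b$ and $e_0$ so that $h(n) = (b, e_0, e_1)$. Now, by construction
we have $\varphi(n, X_n)$ for all $n$. Since $\varphi$ was arbitrary, we conclude that $\mathrm{HYP}$ satisfies the
$\Sigma^1_1$ axiom of choice. □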
12.1.3 Proof of Proposition 12.1.12


Our aim now is to fill in the gap left in our proof of Theorem 12.1.13 by showing
how to derive the $\Sigma^1_1$ axiom of choice from $\mathsf{WCWO}$. There are three main definitions
we will need: indecomposable well orderings, the explosion of a tree, and the wedge
product. We will also need a series of technical lemmas concerning these notions.
Some of these are straightforward and left to the exercises. Others require a somewhat
delicate use of $\mathsf{WCWO}$.


Definition 12.1.16. Let $X$ be a well ordering.


1. A coinitial segment of $X$ is the restriction of $\leq_X$ to $\{x \in X : x \geq_X y\}$ for some
$y \in X$. We call this the coinitial segment above $y$.
2. $X$ is indecomposable if $X \preceq Y$ for every coinitial segment $Y$ of $X$.


Lemma 12.1.17. The following is provable in RCAo. Suppose X, Y, and Z are well 
orderings, X is indecomposable, and X — Y + Z. Then either X = Y or X — Z. 


The proof is left to Exercise 12.5.5. 


Lemma 12.1.18. The following is provable in RCAg+WCWO. Suppose Xo, ..., Xm-1 
are well orderings such that Xm-1 — ++: — X1 — Xo. Then Xm_-1 — Xo. 


Proof. We argue in $\mathsf{RCA}_0$. Assume $\mathsf{WCWO}$. For each $n < m - 1$, let $Y_n = X_{n+1} + X_n$,
and let $Y_{m-1} = X_{m-1}$. Then, let $W_0 = \sum_{n<m} Y_n$, and let $W_1 = (\sum_{n<m} X_n) + 1$. An
element of $W_0$ is thus formally a pair $(y, n)$, where $n < m$ and $y \in Y_n$. For ease of
notation, we will refer to this element simply by $y$, and in this way identify each
$Y_n \times \{n\} \subseteq W_0$ with $Y_n$. Similarly, we identify each $(X_n \times \{n\}) \times \{0\} \subseteq W_1$ with $X_n$,
each $X_{n+1} \times \{0\} \subseteq Y_n$ with $X_{n+1}$, etc. The context will make the usage clear.

By Lemma 12.1.3, each of Wo and W, is a well ordering. Hence, by WCWO, we 
have one of the following. 


Case 1: Wo = W\. Fix a witnessing embedding f: Wo — W,. We claim that for all 
n<m, f(Oy,) 2w, Ox, and f(Y,) © Xn. Note that these are both my statements in 
n (using f, Wo, and W; as parameters), so we can proceed by induction (or really, 
LI’). 

We begin with the first claim. This is obvious for n = 0, since Ox, is also the 
least element of Wj. If the claim fails, we can find the least positive n < m such 
that f(Oy,) <w, Ox,. Since f is order preserving, y <yw, Oy, for every y € Yn-1, 
and f(Oy,_,) 2w, Ox,_, by assumption, it follows that Y,,; U {Oy,} = X,-1 via 
f. But Y,-1 = X; + Xn-1, so this implies that X,;-; <> Xy-1,which is impossible 
(Exercise 12.5.3). So the first claim holds. 

Now for the second claim. In light of the first, we have this for n = m — 1 
since Yy-1 = Xm_1. So if the claim fails, we can fix the least positive n < m 
such that f(Y,,) © X,. In particular, f(Oy,) € X,, and by the first claim we have 
f (Oy,_,) 2w, Ox,_,. Since f is order preserving and Oy, , <wy y <wo Oy, for all 
y € Y,-1, the fact that f(Oy, ,) ¢ X,-1 means that there is a coinitial segment Z of 
Y,-1 such that f(Z) ¢ X,,. Thus f witnesses that Z = X,,. Since Y,_) = Xn +Xn-1, 
we may take Z to be a coinitial segment of X,_;. But X;,_ is indecomposable, so 
then X,-1 — Z = X, — X,-1, which is impossible. So the second claim holds, 
too. 

By the second claim, f witnesses that Y, <= X,, for alln < m. Forn <m-—1, we 
have Y, = Xn+1+Xn, hence f also witnesses that Xj;41 — X,.Foreachn < m—1, let 
Jn denote the restriction of f to X,+ (inside Y,,). Then the composition hgo- + -ohm-2 
witnesses that X,-1 — Xo. 


Case 2: W, = Wo. We will show this case cannot hold. Fix an embedding g: W; —> 
Wo. We claim that for each n < m, g(Ox,,) 2wo Oy,,. We proceed by induction. The 
claim is obvious for n = 0, since Oy, is also the <y,-least element of Wo. If the claim 
fails, we may consequently fix the least positive n < m such that g(Ox,,) <wy Oy,. 
Then also g(x) <w, Oy, for every x € Xy_1. Since g(x) >w, Oy,_, for every 
x € X,-1 by assumption, g witnesses that X,_1 U {0x, } embeds into Y,_1. Thus 
Xn-1  Yn-1 = Xn + Xn-1. By Lemma 12.1.17, and the fact that X, — Xy_1, we 
conclude that X,-1 — Xyj-1, which is impossible. The claim is proved. 

Now, in particular, we have g(Ox,, ,) >wo Oy,,_,- Thus, g witnesses that X,,—-1 U 
{lw,} embeds into Y,,-1 = Xm-1, where lw, is the <w,-largest element of Wy. 
But this means that X;,-1 —* Xm-_1, which is again impossible. This completes the 
proof. □


Lemma 12.1.19. The following is provable in RCAg + WCWO. There is no sequence 
(X, :n € N) of indecomposable well orderings such that for each n, Xn+1 © Xn. 


Proof. Assume WCWO and suppose, towards a contradiction, that we had such a 
sequence, (X, :n € N). As in the previous lemma, let Y,, = X,,4; + X,, for all n, and 
let Wo = Sinen Yn. Let Wi = (Spew Xn) + 1. By Lemma 12.1.3, each of Wo and W; 
is a well ordering, so we have the following case analysis. 


Case 1: Wo = Wj. Fix an embedding f: Wo — Wj. As in Case | of the previous 
lemma, we can argue that f(Oy,) >w, Ox, for all n. We claim that also f(Y,) © Xn. 
Suppose not, as witnessed by n. Choose n* so that f(Oy,,,) € Xn«. Then by Lx? 
we may choose the largest m < n* so that f(Y,) A Xm # @. Then f(Z) C X,, for 
some coinitial segment of Y,,, and since Y,, = Xn+1 + Xn, we may take Z to be a 
coinitial segment of X,,. Since X,, is indecomposable, it follows that X, — X,,. But 
fn) £ Xn, thus necessarily m > n. But then X,, — X, and so X;, — X;,, which 
is impossible. 


Case 2: $W_1$ embeds into $W_0$. Let $g : W_1 \to W_0$ be an embedding. As in Case 2 of the previous
lemma, we can argue that $g(0_{X_n}) \geq_{W_0} 0_{Y_n}$ for all $n$. Fix $n$ so that $g(1_{W_1}) \in Y_n$, where
$1_{W_1}$ is the $<_{W_1}$-largest element of $W_1$. Then


$0_{Y_{n+1}} \leq_{W_0} g(0_{X_{n+1}}) <_{W_0} g(1_{W_1}) <_{W_0} 0_{Y_{n+1}}$,


which is a contradiction. □
We now add the following definition of the so-called explosion of a tree $T \subseteq \mathbb{N}^{<\mathbb{N}}$,
which is similar to the double descent tree $T * \mathbb{N}^{<\mathbb{N}}$ (Definition 12.1.8).


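Definition 12.1.20. Let $T$ be a subtree of $\mathbb{N}^{<\mathbb{N}}$. Then $E(T)$ is the set containing $\langle\rangle$
and all finite strings of the form


$\langle (x_0, k_0), \ldots, (x_{n-1}, k_{n-1}) \rangle \in \mathbb{N}^{<\mathbb{N}}$,

where $\langle x_0, \ldots, x_{n-1} \rangle \in T$ and $k_0, \ldots, k_{n-1} \in \mathbb{N}$ are arbitrary.
Clearly, $E(T)$ is always a tree.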


Lemma 12.1.21. The following is provable in $\mathsf{ACA}_0$. If $T$ is a well founded subtree
of $\mathbb{N}^{<\mathbb{N}}$ then $\mathrm{KB}(E(T))$ is an indecomposable well ordering.


Proof. We argue in $\mathsf{ACA}_0$. Fix $T$. Any path through $E(T)$ has the form $f \oplus g$, where
$f, g \in \mathbb{N}^{\mathbb{N}}$ and $f \in [T]$. So by assumption, $E(T)$ must be well founded. Hence,
$\mathrm{KB}(E(T))$ is a well ordering by Proposition 5.8.8. It thus remains to show that
$\mathrm{KB}(E(T))$ is indecomposable. Consider any $\alpha \in E(T)$ and let $Z$ be the coinitial
segment of $\mathrm{KB}(E(T))$ above $\alpha$. Say $\alpha(0) = (x, k)$ and fix $m$ so that $(0, m) > (x, k)$
as codes (i.e., as numbers). By the definition of the pairing function, it follows that
$(y, j) > (x, k)$ for all $y$ and all $j \geq m$. Let $S$ be the set of all $\beta \in E(T)$ such that if
$\beta(0) = (y, j)$ then $j \geq m$. Then $S \subseteq Z$, and we define $f : E(T) \to S$ as follows: for
$\beta \in E(T)$ with $\beta(0) = (y, j)$, let $f(\beta) = \langle (y, j + m) \rangle^\frown \langle \beta(1), \ldots, \beta(|\beta| - 1) \rangle$. Now it is
not difficult to verify that $f$ is an embedding, so $\mathrm{KB}(E(T))$ embeds into $Z$. Since $Z$ was
arbitrary, we are done. □
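The shifting map used in the proof above can be tried out on a small finite tree. The following is only a sketch under stated assumptions: the pairing function `pair` (here $(x, k) \mapsto (x + k)^2 + x$) is a hypothetical choice that is monotone in both coordinates, not necessarily the coding fixed in the text, and the names `unpair`, `in_explosion`, and `shift` are introduced purely for illustration.

```python
def pair(x, k):
    # Hypothetical pairing function (x, k) -> (x + k)^2 + x.
    # Assumption: any coding that is monotone in both coordinates would do.
    return (x + k) ** 2 + x


def unpair(code):
    # Invert `pair`: total = x + k is the integer square root of the code.
    total = 0
    while (total + 1) ** 2 <= code:
        total += 1
    x = code - total ** 2
    return x, total - x


def in_explosion(beta, tree):
    # beta is a tuple of pair codes. It lies in E(T) exactly when the
    # first coordinates of its entries form a node of T.
    return tuple(unpair(c)[0] for c in beta) in tree


def shift(beta, m):
    # The map f from the proof: replace the first entry (y, j) by (y, j + m)
    # and leave the remaining entries unchanged.
    if not beta:
        return beta
    y, j = unpair(beta[0])
    return (pair(y, j + m),) + tuple(beta[1:])
```

For instance, with the finite tree `{(), (0,), (1,), (0, 2)}` and `beta = (pair(0, 3), pair(2, 5))`, the string `shift(beta, 4)` is again in the explosion and its first entry decodes to `(0, 7)`.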


The last technical definition we need is the following. 


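Definition 12.1.22. The following definition is made in $\mathsf{RCA}_0$. Let $\langle T_n : n \in \mathbb{N} \rangle$ be
a sequence of subtrees of $\mathbb{N}^{<\mathbb{N}}$.


1. A wedge of $\langle T_n : n \in \mathbb{N} \rangle$ is a sequence $w = \langle \alpha_0, \ldots, \alpha_{m-1} \rangle$ for some $m \in \mathbb{N}$
where $\alpha_n \in T_n$ and $|\alpha_n| = m - n$. We say $w$ has size $m$.
2. A wedge $w^*$ extends a wedge $w$ if $w^* = \langle \alpha^*_0, \ldots, \alpha^*_{m^*-1} \rangle$ and $w = \langle \alpha_0, \ldots, \alpha_{m-1} \rangle$
for some $m^* > m$ and $\alpha^*_n \succeq \alpha_n$ for all $n < m$.
3. The wedge product of $\langle T_n : n \in \mathbb{N} \rangle$, denoted $\prod_{n \in \mathbb{N}} T_n$, is the set of all (codes for)
sequences of the form $\langle w_0, \ldots, w_{m-1} \rangle$, where each $w_n$ is a wedge of $\langle T_n : n \in \mathbb{N} \rangle$
of size $n + 1$ and $w_{n+1}$ extends $w_n$.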


Clearly, $\prod_{n \in \mathbb{N}} T_n$ is itself a subtree of $\mathbb{N}^{<\mathbb{N}}$. Our final two lemmas establish basic
properties of this tree.
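To make the shape of a wedge concrete, here is a small sketch that checks the conditions of Definition 12.1.22 on finite fragments of trees. It assumes each tree is given as a finite Python set of tuples and the sequence of trees as a list; the function names `is_wedge`, `extends`, and `in_wedge_product` are illustrative and not from the text.

```python
def is_wedge(w, trees):
    # w = (alpha_0, ..., alpha_{m-1}) is a wedge of size m when alpha_n is a
    # node of the n-th tree and has length m - n.
    m = len(w)
    return all(a in trees[n] and len(a) == m - n for n, a in enumerate(w))


def extends(w_star, w):
    # w_star extends w when it is strictly larger and each alpha*_n extends
    # alpha_n (as strings) for every n below the size of w.
    if len(w_star) <= len(w):
        return False
    return all(a_star[:len(a)] == a for a_star, a in zip(w_star, w))


def in_wedge_product(seq, trees):
    # seq = (w_0, ..., w_{m-1}) with w_n a wedge of size n + 1 and
    # w_{n+1} extending w_n.
    for n, w in enumerate(seq):
        if len(w) != n + 1 or not is_wedge(w, trees):
            return False
        if n > 0 and not extends(w, seq[n - 1]):
            return False
    return True
```

With `trees = [{(), (0,), (0, 1), (0, 1, 2)}, {(), (5,), (5, 6)}, {(), (7,)}]`, the sequence of wedges `((0,),)`, `((0, 1), (5,))`, `((0, 1, 2), (5, 6), (7,))` is an element of the wedge product: the $n$-th wedge adds one new string from the next tree and lengthens every earlier string by one.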


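Lemma 12.1.23 (Friedman; see Hirst and Friedman [114]). The following is
provable in $\mathsf{RCA}_0$. For each sequence $\langle T_n : n \in \mathbb{N} \rangle$ of subtrees of $\mathbb{N}^{<\mathbb{N}}$ such that
$\neg \mathrm{WF}(\prod_{n \in \mathbb{N}} T_n)$, there exists a sequence $\langle f_n : n \in \mathbb{N} \rangle$ such that $f_n \in [T_n]$ for each $n$.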


The proof is left to Exercise 12.5.4. 
Lemma 12.1.24 (Friedman; see Hirst and Friedman [114]). The following is
provable in $\mathsf{RCA}_0$ + WCWO. If $\langle T_n : n \in \mathbb{N} \rangle$ is a sequence of subtrees of $\mathbb{N}^{<\mathbb{N}}$ such
that $\neg \mathrm{WF}(T_n)$ for all $n$, then $\neg \mathrm{WF}(\prod_{n \in \mathbb{N}} T_n)$.


Proof. Assume WCWO. By Proposition 12.1.11, we can argue over $\mathsf{ACA}_0$. Let $\langle T_n : n \in \mathbb{N} \rangle$
be given such that $\neg \mathrm{WF}(T_n)$ for all $n$. Seeking a contradiction, assume
$\mathrm{WF}(\prod_{n \in \mathbb{N}} T_n)$. Fix $p : \mathbb{N} \to \mathbb{N}$ so that for each $i \in \mathbb{N}$ there are infinitely many $n$ with
$p(n) = i$. Then for each $s$, any path through $\prod_{n \geq s} T_{p(n)}$ arithmetically defines a path
through $\prod_{n \in \mathbb{N}} T_n$ (using $p$ and $\langle T_n : n \in \mathbb{N} \rangle$ as parameters). Thus, we must have
$\mathrm{WF}(\prod_{n \geq s} T_{p(n)})$. By Lemma 12.1.21, $\mathrm{KB}(E(\prod_{n \geq s} T_{p(n)}))$ is an indecomposable
well ordering. We will show that for each $s$,


KB(E( | | Zpom)) 2 KBE | Tm). (12.4) 


n>stl n2s 


thereby contradicting Lemma 12.1.19. 

So fix s, along with any f € [7;]. We will use f to help us define an embedding 
h: KB(E( Tnss41 Lp) 2 KB(E( Ins Lp(n)))- First, we define a function g on 
the set of all pairs (w, k) where w is a wedge of (7, :n > s+1) and k €N. If w has 
size m, then let 


g((w,k)) = (f fmt 1,w(0),...,w(m — 1)), max{g(x) : x < (w,k)} 41). 


Note that (f [m+ 1,w(0),...,w(m — 1)) is a wedge of (T, : n 2 s) of size 
m +1. (That is, w(0) is an element of T,,; of length m, so we prepend f [m+ 1, 
an element of 7; of length m + 1.) Now define h as follows. Given an element 
z= ((wo, ko), ++, (Wm-1,km-1)) € KB(E([Tnss41 Tp(ny)), let 


A(z) = ((f(0),0), g({wo, ko)))s «+ «+8 ((Wm-15 km-1))): 


Then $h(z) \in \mathrm{KB}(E(\prod_{n \geq s} T_{p(n)}))$, and it is not difficult to check that $h$ defines an
embedding. Moreover, by the usual properties of the pairing function, $h(z) <_{\mathrm{KB}}
(f(0), 1)$ for all $z$, so $h$ actually witnesses (12.4). □


At last, we are ready to prove Proposition 12.1.12. 


Proof (of Proposition 12.1.12). Assume WCWO. By Proposition 12.1.11, we can
argue over $\mathsf{ACA}_0$. Let $\psi(n, X)$ be a $\Sigma^1_1$ formula such that $(\forall n)(\exists X)\psi(n, X)$. By
Corollary 5.8.4, there is a sequence of trees $\langle T_n : n \in \mathbb{N} \rangle$ such that for every $n$ we
have $(\forall X)[\psi(n, X) \leftrightarrow (\exists g \in \mathbb{N}^{\mathbb{N}})[X \oplus g \in [T_n]]]$. Since $\neg \mathrm{WF}(T_n)$ for all $n$ by
assumption, so also $\neg \mathrm{WF}(\prod_{n \in \mathbb{N}} T_n)$ by Lemma 12.1.24. Applying Lemma 12.1.23
we find a sequence $\langle f_n : n \in \mathbb{N} \rangle$ such that $f_n \in [T_n]$ for each $n$. Writing each $f_n$ as
$X_n \oplus Z_n$, we have that $\psi(n, X_n)$ holds. Then $\langle X_n : n \in \mathbb{N} \rangle$ satisfies the conclusion of
the $\Sigma^1_1$ axiom of choice, and the proof is complete. □


12.2 Descriptive set theory 


We have already seen some ways we can “speak about” collections of sets and
functions in $\mathcal{L}_2$, even though the language itself only has variables for numbers and
sets of numbers. In this section, we survey representations for Borel and analytic
subsets of Baire space and Cantor space. We frame our discussion in terms of the
former, but an analogous development works for the latter.

The concept of a code for a Borel subset of $\omega^\omega$ is familiar from effective descriptive
set theory. (See, e.g., Mansfield and Weitkamp [201].) Computable codes are
used to define the lightface Borel hierarchy. We will use the following formulation.
In $\mathsf{RCA}_0$, fix an enumeration $\langle \alpha_0, \alpha_1, \ldots \rangle$ of all elements of $\mathbb{N}^{<\mathbb{N}}$.


Definition 12.2.1 (Borel codes). The following definition is made in $\mathsf{RCA}_0$. A Borel
code is a well founded tree $B \subseteq \omega^{<\omega}$ with the property that there is a unique $m_B \in \omega$
such that $\langle m_B \rangle \in B$.


The precise nature of the coding is better explained with the help of the following 
definition. 


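Definition 12.2.2 (Evaluation maps). The following definition is made in $\mathsf{RCA}_0$.
Let $B \subseteq \mathbb{N}^{<\mathbb{N}}$ be a Borel code, and fix any $X \in \omega^\omega$. Then $f : B \to 2$ is an evaluation
map for $B$ at $X$ if the following conditions hold for all $\alpha \in B$.


1. If $\alpha$ is a leaf of $B$, then:

• if $\alpha(|\alpha| - 1) < 2$ then $f(\alpha) = \alpha(|\alpha| - 1)$,
• if $\alpha(|\alpha| - 1) = 2i + 2$ for some $i \in \omega$ then $f(\alpha) = 1 \leftrightarrow \alpha_i \prec X$,
• if $\alpha(|\alpha| - 1) = 2i + 3$ for some $i \in \omega$ then $f(\alpha) = 1 \leftrightarrow \alpha_i \nprec X$.

2. If $\alpha \neq \langle\rangle$ and $\alpha$ is not a leaf of $B$, then:

• if $\alpha(|\alpha| - 1)$ is even, then $f(\alpha) = 1 \leftrightarrow (\exists x)[\alpha x \in B \wedge f(\alpha x) = 1]$,
• if $\alpha(|\alpha| - 1)$ is odd, then $f(\alpha) = 1 \leftrightarrow (\forall x)[\alpha x \in B \to f(\alpha x) = 1]$.

3. $f(\langle\rangle) = f(\langle m_B \rangle)$.


$X$ belongs to $B$, written $X \in B$, if there is an evaluation map $f$ for $B$ at $X$ such that
$f(\langle\rangle) = 1$. $X$ does not belong to $B$, written $X \notin B$, if there is an evaluation map $f$
for $B$ at $X$ such that $f(\langle\rangle) = 0$. Note that $\in$ here is being used as an abbreviation, not
as the symbol of $\mathcal{L}_2$.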


The idea is that among the leaves of $B$, those with last bit 0 code the empty set.
Those with last bit 1 code $\mathbb{N}^{\mathbb{N}}$. And those with last bit $2i + 2 + j$ for $j$ equal to 0 or
1 code $[[\alpha_i]]$ and $\mathbb{N}^{\mathbb{N}} \setminus [[\alpha_i]]$, respectively. All other nonempty strings in $B$ code
unions or intersections, depending on whether their last bit is even or odd. For example, the
open set $\bigcup_{\alpha \in U} [[\alpha]]$ can be coded by $B \subseteq \mathbb{N}^{<\mathbb{N}}$ containing the strings $\langle\rangle$ and $\langle 0 \rangle$,
and then for each $i$ such that $\alpha_i \in U$, the string $\langle 0 \rangle^\frown \langle 2i + 2 \rangle$. Clearly, Borel codes are
in general not unique.
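For a finite Borel code, the clauses of Definition 12.2.2 can be evaluated by ordinary recursion on the tree. The sketch below does this under several assumptions that are not part of the text: the code is a finite set of tuples, the enumeration of strings is a hypothetical finite list `alphas`, and the point $X$ is represented by a finite prefix `x` that is at least as long as every stem consulted. Genuine Borel codes may be infinitely branching, in which case no such direct recursion is available.

```python
def evaluate(code, alphas, x):
    """Return f(<>) for the evaluation map of a finite Borel code at x.

    code   : finite set of tuples forming a well founded tree with a unique
             node of length 1 (assumption: finite, unlike general codes)
    alphas : list of tuples, a hypothetical finite enumeration of stems
    x      : tuple, a sufficiently long finite prefix of the point X
    """
    def children(a):
        return [b for b in code if len(b) == len(a) + 1 and b[:len(a)] == a]

    def value(a):
        kids = children(a)
        last = a[-1]
        if not kids:  # leaf clauses
            if last < 2:
                return last
            i, complemented = divmod(last - 2, 2)
            stem = alphas[i]
            is_prefix = x[:len(stem)] == stem
            return int(is_prefix != bool(complemented))
        if last % 2 == 0:  # even last entry: union
            return int(any(value(b) for b in kids))
        return int(all(value(b) for b in kids))  # odd last entry: intersection

    (root,) = children(())  # the unique node <m_B>
    return value(root)
```

As a usage example mirroring the open set described above, take `alphas = [(0,), (1,), (0, 0), (0, 1)]` and `U` consisting of the stems with indices 1 and 3, so that the code is `{(), (0,), (0, 4), (0, 8)}`. Then a point beginning `(0, 1, 5)` evaluates to 1, while a point beginning `(0, 0, 5)` evaluates to 0.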

Given B and X as above, an evaluation map for B at X can be constructed by 
transfinite recursion along the rank of B. The following theorem formalizes this fact. 
Theorem 12.2.3 (Simpson [288]). $\mathsf{ATR}_0$ proves that for every Borel code $B \subseteq \mathbb{N}^{<\mathbb{N}}$
and $X \in \mathbb{N}^{\mathbb{N}}$, there exists a unique evaluation map for $B$ at $X$.


Proof. Uniqueness is actually provable in $\mathsf{ACA}_0$, and this is left to Exercise 12.5.6.
To prove existence, we argue in $\mathsf{ATR}_0$. Fix $B$ and $X$. Since $\mathrm{WF}(B)$, we have by
Lemma 12.1.6 that $\mathrm{rk}(B) \leq \mathrm{KB}(B)$. Let $g : B \to \mathrm{KB}(B)$ be the function given by
Lemma 12.1.5. So


$g(\alpha) = \sup\{S_{\mathrm{KB}(B)}(g(\beta)) : \beta \in B \wedge \beta \succ \alpha\}$


for all $\alpha \in B$. Let $Z = \mathrm{range}(g) \setminus \{g(\langle\rangle)\}$, regarded as a suborder of $\mathrm{KB}(B)$. Clearly,
$\mathrm{WO}(Z)$. We define a sequence of sets $\langle F_x : x \in Z \rangle$ by arithmetical transfinite
recursion along $Z$. Let $F_{0_Z}$ be the set of all pairs $(\alpha, v)$ such that $g(\alpha) = 0_Z$ (so $\alpha$ is
a leaf of $B$) and $v < 2$, with $v = 1$ if and only if $\alpha(|\alpha| - 1) = 1$, or $\alpha(|\alpha| - 1) = 2i + 2$
and $\alpha_i \prec X$, or $\alpha(|\alpha| - 1) = 2i + 3$ and $\alpha_i \nprec X$. For $x >_Z 0_Z$, we let $F_x$ consist
of all pairs $(\alpha, v)$ such that $g(\alpha) = x$ and $v < 2$, with $v = 1$ if and only if either
$\alpha(|\alpha| - 1)$ is even and $(\beta, 1) \in F_{g(\beta)}$ for some $\beta \succ \alpha$ in $B$ of length $|\alpha| + 1$, or $\alpha(|\alpha| - 1)$
is odd and $(\beta, 1) \in F_{g(\beta)}$ for all $\beta \succ \alpha$ in $B$ of length $|\alpha| + 1$. Let $v_B$ be such that
$(\langle m_B \rangle, v_B) \in F_{g(\langle m_B \rangle)}$. Then $f = \bigcup_{x \in Z} F_x \cup \{(\langle\rangle, v_B)\}$ is an evaluation map for $B$
at $X$. □


Classically, given a Borel code $B$, the collection of $X \in \omega^\omega$ such that the evaluation
map $f$ for $B$ at $X$ satisfies $f(\langle\rangle) = 1$ is a Borel set. Conversely, given any Borel
set $\mathcal{B}$ there is a code $B \subseteq \omega^{<\omega}$ satisfying Definition 12.2.1 such that the elements
of $\mathcal{B}$ are precisely those $X \in \omega^\omega$ such that the evaluation map for $B$ at $X$ satisfies
$f(\langle\rangle) = 1$. In this sense, Borel codes do a good job of representing Borel sets.

There is one subtlety, though, which is that the proof of Theorem 12.2.3 does not
go through in the absence of $\mathsf{ATR}_0$. In fact, it does not even go through for trivial
Borel codes (so called in [78]), which are the codes such that $\alpha(|\alpha| - 1) < 2$ for all
leaves $\alpha$. The only Borel sets such codes can represent are $\emptyset$ or $\omega^\omega$, so defining an
evaluation map may seem trivial. But, per Definition 12.2.2, evaluation maps must
consistently assign values to all the elements of a given code, which can be quite
complicated even if the “eventual” Borel set is trivial. This is an insurmountable
problem.


Theorem 12.2.4 (Dzhafarov, Flood, Solomon, and Westrick [78]). The following
are equivalent over $\mathsf{RCA}_0$.


1. $\mathsf{ATR}_0$.
2. For every Borel code $B \subseteq \mathbb{N}^{<\mathbb{N}}$ and $X \in \mathbb{N}^{\mathbb{N}}$, there exists an evaluation map for
$B$ at $X$.
3. For every trivial Borel code $B \subseteq \mathbb{N}^{<\mathbb{N}}$ and $X \in \mathbb{N}^{\mathbb{N}}$, there exists an evaluation
map for $B$ at $X$.


This is a key issue because, without an evaluation map, the relations $X \in B$ and
$X \notin B$ for a Borel code $B$ hold no meaning. In particular, the preceding theorem has
the following somewhat surprising consequence.
Corollary 12.2.5. The following are equivalent over $\mathsf{RCA}_0$.


1. $\mathsf{ATR}_0$.
2. For every Borel code $B \subseteq \mathbb{N}^{<\mathbb{N}}$ and $X \in \mathbb{N}^{\mathbb{N}}$, either $X \in B$ or $X \notin B$.
3. For every Borel code $B \subseteq \mathbb{N}^{<\mathbb{N}}$, there exists $X \in \mathbb{N}^{\mathbb{N}}$ such that $X \in B$ or $X \notin B$.


The consequence of this phenomenon is that even very basic results about Borel
sets tend to immediately jump up at least to the level of $\mathsf{ATR}_0$ in complexity. (In
the parlance of Section 1.3, Borel codes impose a significant coding overhead.)
Recently, Westrick (see [78]) has proposed the concept of a determined Borel code,
which consists of a Borel code $B$ together with the assertion that for each $X$ an
evaluation map for $B$ at $X$ exists. Formulating a theorem about Borel sets in terms of
determined Borel codes escapes the problems suggested above and, in practice,
the reverse mathematics analysis seems more faithful to the “true” strength of the
theorem. (Examples may be found in [78], as well as in newer papers by Astor,
Dzhafarov, Montalbán, Solomon, and Westrick [5] and by Towsner, Weisshaar, and
Westrick [312].)

Another important collection of subsets of Baire (or Cantor) space is the collection
of analytic sets. These are the subsets definable by $\Sigma^1_1$ formulas of $\mathcal{L}_2$. (We may
regard a $\Sigma^1_1$ formula $\varphi(X)$ as being defined on $X \in \mathbb{N}^{\mathbb{N}}$ by identifying $X$ with the
characteristic function of its graph as a function.) Analytic sets, too, admit a more
combinatorial representation.


Definition 12.2.6 (Analytic codes). The following definition is made in $\mathsf{RCA}_0$.
Given a tree $A \subseteq \mathbb{N}^{<\mathbb{N}}$, an element $X \in \mathbb{N}^{\mathbb{N}}$ is a point of $A$ if $(\exists f \in \mathbb{N}^{\mathbb{N}})[f \oplus X \in [A]]$.


It is customary to call the tree $A$ an analytic code in this context. Clearly, the
collection of all points of a given analytic code is definable by a $\Sigma^1_1$ formula. The
next result establishes the converse.


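Theorem 12.2.7 (Simpson [288]). If $\varphi(X)$ is a $\Sigma^1_1$ formula, then $\mathsf{ACA}_0$ proves there
is an analytic code $A \subseteq \mathbb{N}^{<\mathbb{N}}$ such that for all $X$, $\varphi(X) \leftrightarrow X$ is a point of $A$.


Proof. By Kleene’s normal form theorem (Theorem 5.8.2), there is an arithmetical
formula $\theta$ such that


$\mathsf{ACA}_0 \vdash (\forall X)[\varphi(X) \leftrightarrow (\exists f \in \mathbb{N}^{\mathbb{N}})(\forall k)[\theta(X \upharpoonright k, f \upharpoonright k)]]$.


Let $A$ be the set of all sequences $\langle (\alpha_0, \beta_0), \ldots, (\alpha_{n-1}, \beta_{n-1}) \rangle \in \mathbb{N}^{<\mathbb{N}}$ such that:


• $\alpha_0 \preceq \cdots \preceq \alpha_{n-1}$,
• $\beta_0 \preceq \cdots \preceq \beta_{n-1}$,
• $|\alpha_i| = |\beta_i| = i$ for all $i < n$,
• $\theta(\alpha_{n-1}, \beta_{n-1})$ holds.


Then $A$ is a tree and $\varphi(X) \leftrightarrow X$ is a point of $A$. □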
The theorem sets the stage for deriving most of the standard facts about the relationship
between analytic and Borel sets. For starters, note that by Definition 12.2.2,
belonging and not belonging to a given Borel code are each definable by a $\Sigma^1_1$ formula.
Thus, Theorem 12.2.7 yields the following corollary, formalizing the fact that
every Borel set is analytic.


Corollary 12.2.8. The following is provable in $\mathsf{ATR}_0$. If $B \subseteq \mathbb{N}^{<\mathbb{N}}$ is a Borel code,
there exist trees $A_0, A_1 \subseteq \mathbb{N}^{<\mathbb{N}}$ such that for all $X$, $X \in B \leftrightarrow X$ is a point of $A_0$, and
$X \notin B \leftrightarrow X$ is a point of $A_1$.


Note that, by the uniqueness of evaluation maps in $\mathsf{ATR}_0$, no $X$ can both belong and
not belong to a given Borel code, so the trees $A_0$ and $A_1$ in the corollary are (codes
for) each other’s complements.

$\mathsf{ATR}_0$ can also prove the converse: any analytic set whose complement is also
analytic must be Borel. This follows from the following theorem on the strength of
Lusin’s separation theorem from descriptive set theory, which is part (2) below.


Theorem 12.2.9 (Simpson [288]). The following are equivalent over $\mathsf{RCA}_0$.


1. $\mathsf{ATR}_0$.
2. If $A_0, A_1 \subseteq \mathbb{N}^{<\mathbb{N}}$ are analytic codes such that no $X$ is a point of both $A_0$ and $A_1$,
then there is a Borel code $B$ such that for all $X$, if $X$ is a point of $A_0$ then $X \in B$,
and if $X$ is a point of $A_1$ then $X \notin B$.


We omit the proof for brevity. The reversal follows by observing that Lusin’s separation
theorem easily implies the $\Sigma^1_1$ separation principle (Theorem 5.8.20), whose
equivalence to $\mathsf{ATR}_0$ we saw in Section 12.1.1. The following corollary is now
immediate.


Corollary 12.2.10. The following is provable in $\mathsf{ATR}_0$. If $A_0, A_1 \subseteq \mathbb{N}^{<\mathbb{N}}$ are analytic
codes such that every $X \in \mathbb{N}^{\mathbb{N}}$ is a point of exactly one of $A_0$ and $A_1$, then there is a
Borel code $B \subseteq \mathbb{N}^{<\mathbb{N}}$ such that for all $X$, $X$ is a point of $A_0 \leftrightarrow X \in B$.


Corollaries 12.2.8 and 12.2.10 give a proof in $\mathsf{ATR}_0$ of the classical Souslin’s theorem,
which states that a set is Borel if and only if it and its complement are analytic. This
theorem is not equivalent to $\mathsf{ATR}_0$, however (see Simpson [288, Remark V.3.12]).

We conclude this section by mentioning one further consequence of Theorem 12.2.7,
which will be of independent interest to us in Section 12.3.2. The
proof features a satisfying use of self reference.


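Corollary 12.2.11. If $\varphi(X)$ is a $\Sigma^1_1$ formula then $\mathsf{ACA}_0$ proves there is an $X$ such that
$\neg(\varphi(X) \leftrightarrow \mathrm{WO}(X))$.


Proof. We argue in $\mathsf{ACA}_0$. Given a tree $A \subseteq \mathbb{N}^{<\mathbb{N}}$ and $X \in \mathbb{N}^{\mathbb{N}}$, define

$T_A(X) = \{\alpha \in \mathbb{N}^{<\mathbb{N}} : (\forall k < |\alpha|)[\langle (\alpha \upharpoonright 0, X \upharpoonright 0), \ldots, (\alpha \upharpoonright k, X \upharpoonright k) \rangle \in A]\}$.


Then $T_A(X)$ is a tree, and $X$ is a point of $A$ if and only if $\neg \mathrm{WF}(T_A(X))$. Now fix $\varphi$,
and let $\psi(X)$ be the formula $\varphi(\mathrm{KB}(T_X(X)))$, which is again $\Sigma^1_1$. By Theorem 12.2.7,
there is an $A$ such that for all $X$, $\psi(X) \leftrightarrow \neg \mathrm{WF}(T_A(X))$. Then we have

$\varphi(\mathrm{KB}(T_A(A))) \leftrightarrow \psi(A) \leftrightarrow \neg \mathrm{WF}(T_A(A)) \leftrightarrow \neg \mathrm{WO}(\mathrm{KB}(T_A(A)))$,


by Proposition 5.8.8. Taking $X = \mathrm{KB}(T_A(A))$ yields the result. □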
In this section, we consider so-called Gale–Stewart games, also known as games
of perfect information. These consist of two players who alternate playing natural
numbers, with the first player winning if the resulting sequence of numbers produced
during the play belongs to a predetermined class of elements of $\omega^\omega$, called a winning
class, and the second player winning otherwise. A game is said to be determined if
one of the players has a winning strategy. In set theory, the axiom of determinacy states
that every game is determined. This has many interesting consequences including,
perhaps most famously, that every set of reals is Lebesgue measurable (Mycielski and
Świerczkowski [229]). Of course, this means the axiom of determinacy is provably
false in ZFC because, as is well known, the axiom of choice implies the existence of
nonmeasurable sets.

The tension between choice and determinacy inspired a great deal of work into
just “how much” determinacy ZFC does admit. This is usually calibrated by the
topological complexity of the winning class. One of the earliest results here is due
to Gale and Stewart [121], who showed that open determinacy (determinacy for
games in which the winning class is open) is provable in ZFC. Wolfe [328] extended
this to $F_\sigma$ determinacy, and Davis [62] to $F_{\sigma\delta}$ determinacy. Then, in a sweeping
generalization, Martin [203] showed that ZFC actually proves Borel determinacy.
And this, it turns out, is the limit: even analytical determinacy already requires
stronger set theoretic axioms.

Our interest, of course, is in the reverse mathematics content of the above results.
As we will see in Section 12.3.4, the amount of determinacy provable in $\mathsf{Z}_2$, as
opposed to ZFC, stops well short of Borel. And even low levels of determinacy
require comparatively strong axioms to prove (Section 12.3.2). We begin in the next
section with some basic definitions. A more thorough account of the classical theory
of determinacy can be found in many texts on set theory or descriptive set theory
(see, e.g., Kechris [177]).


12.3.1 Gale–Stewart games


The main object of interest for determinacy is the following. 


Definition 12.3.1. Fix $S \subseteq \omega^\omega$. The (Gale–Stewart) game $G(S)$ is a two-player
game in which Players I and II alternate playing elements of $\omega$, with Player I
going first and playing $n_0, n_2, \ldots$ and Player II playing $n_1, n_3, \ldots$. If the sequence
$n_0 n_1 n_2 n_3 \cdots \in \omega^\omega$ belongs to $S$ then Player I wins; otherwise, Player II wins. We
call $S$ the winning class for $G(S)$.


A play of $G(S)$ is typically denoted by $r \otimes s$, where $r, s \in \omega^\omega$, with $r(i) = n_{2i}$
indicating the moves of Player I, and $s(i) = n_{2i+1}$ indicating the moves of Player II.
(Technically, $r \otimes s$ is just $r \oplus s$, but we will use the traditional notation where
appropriate.)


Definition 12.3.2. Fix $S \subseteq \omega^\omega$.


1. A strategy is a function $f : \omega^{<\omega} \to \omega$.
2. A strategy $f : \omega^{<\omega} \to \omega$ is winning for Player I in $G(S)$ if for all $s \in \omega^\omega$, if
$r \in \omega^\omega$ is defined by $r(i) = f(s \upharpoonright i)$ for all $i$, then $r \otimes s \in S$.
3. A strategy $f : \omega^{<\omega} \to \omega$ is winning for Player II in $G(S)$ if for all $r \in \omega^\omega$, if
$s \in \omega^\omega$ is defined by $s(i) = f(r \upharpoonright i + 1)$ for all $i$, then $r \otimes s \notin S$.
4. $G(S)$ is determined if there is a winning strategy for Player I or Player II.
5. If $\Gamma$ is a collection of classes of elements of $\omega^\omega$, $\Gamma$ determinacy is the assertion
that $G(S)$ is determined for every $S \in \Gamma$.
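The way a pair of strategies generates a play can be illustrated by computing a finite initial segment of it. This is only a sketch: strategies are modelled as Python functions on tuples, following the convention of Definition 12.3.2 that Player I's strategy reads Player II's moves so far and Player II's strategy reads Player I's moves so far, and the name `play_prefix` is introduced here for illustration.

```python
def play_prefix(strategy_one, strategy_two, rounds):
    # Build the first `rounds` moves of each player.
    r, s = [], []
    for _ in range(rounds):
        r.append(strategy_one(tuple(s)))  # r(i) = f(s restricted to i)
        s.append(strategy_two(tuple(r)))  # s(i) = g(r restricted to i + 1)
    # Interleave the moves: n_0 = r(0), n_1 = s(0), n_2 = r(1), ...
    return [move for both in zip(r, s) for move in both]
```

For example, if Player I plays the round number (`lambda s: len(s)`) and Player II answers with one more than Player I's latest move (`lambda r: r[-1] + 1`), then `play_prefix` with three rounds returns `[0, 1, 1, 2, 2, 3]`. Whether such a play is a win for Player I depends on the whole infinite sequence, which a finite prefix can decide only for suitably simple winning classes.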


We consider collections $\Gamma$ corresponding to the usual topological classes, e.g., open,
$F_\sigma$, Borel, etc. These are usually described in terms of the boldface Borel hierarchy,
e.g., $\boldsymbol{\Sigma}^0_1$, $\boldsymbol{\Sigma}^0_2$, $\boldsymbol{\Delta}^1_1$, etc. It is known that $\boldsymbol{\Sigma}^0_n$ determinacy is equivalent to $\boldsymbol{\Pi}^0_n$ determinacy,
for all $n$ (see, e.g., Hachtman and Palumbo [131, Section 4] for a concise proof).

In what Simpson [288, p. 210] calls “one of the earliest results of reverse mathematics”,
Steel [302] showed that clopen (i.e., $\boldsymbol{\Delta}^0_1$) and open (i.e., $\boldsymbol{\Sigma}^0_1$) determinacy
are both equivalent to $\mathsf{ATR}_0$, which we will see proved in the next section. On the
other hand, Friedman [107] showed that $\boldsymbol{\Sigma}^0_5$ determinacy is not provable even in full
$\mathsf{Z}_2$. A proof of Friedman’s theorem, in fact of a strengthening to $\boldsymbol{\Sigma}^0_4$ due to Martin
(unpublished; see [204]), appears in Section 12.3.4.

Understanding the levels between $\boldsymbol{\Sigma}^0_1$ and $\boldsymbol{\Sigma}^0_4$ has been the focus of considerable
work. Tanaka [305] showed that determinacy for intersections of open and closed
classes is equivalent to $\Pi^1_1$-$\mathsf{CA}_0$. Tanaka [306] also characterized $\boldsymbol{\Sigma}^0_2$ determinacy,
albeit not in terms of any of the usual subsystems of $\mathsf{Z}_2$. MedSalem and Tanaka [208]
showed that AS determinacy follows from Ay -CAo plus a strong transfinite recursion 
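principle; and Welch [326] showed that $\boldsymbol{\Sigma}^0_3$ determinacy is provable in $\Pi^1_3$-$\mathsf{CA}_0$. More
recently, Montalbán and Shore [222] showed that the dividing line between classes
for which determinacy is and is not provable in $\mathsf{Z}_2$ falls strictly between $\boldsymbol{\Sigma}^0_3$ and
$\boldsymbol{\Sigma}^0_4$ (in fact, $\boldsymbol{\Delta}^0_4$). For $n \geq 1$, say a set $S \subseteq \omega^\omega$ is $n$-$\boldsymbol{\Pi}^0_3$ if there exist $\boldsymbol{\Pi}^0_3$ classes
$S_0, S_1, \ldots, S_n$ such that $S_n = \emptyset$ and $S = (S_0 \setminus S_1) \cup (S_2 \setminus S_3) \cup \cdots$. Thus, the $1$-$\boldsymbol{\Pi}^0_3$
classes coincide with $\boldsymbol{\Pi}^0_3$ classes, the $2$-$\boldsymbol{\Pi}^0_3$ classes are differences of $\boldsymbol{\Pi}^0_3$ classes, etc.
Let $n$-$\boldsymbol{\Pi}^0_3$ also denote the collection $\Gamma$ of all $n$-$\boldsymbol{\Pi}^0_3$ classes.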


Theorem 12.3.3 (Montalbán and Shore [222]). Fix $n \geq 1$.

1. $\Pi^1_{n+2}$-$\mathsf{CA}_0$ proves $n$-$\boldsymbol{\Pi}^0_3$ determinacy.
2. $\Delta^1_{n+2}$-$\mathsf{CA}_0$, even together with full induction, does not prove $n$-$\boldsymbol{\Pi}^0_3$ determinacy.
Since $\mathsf{Z}_2$ can be written as $\bigcup_{n \geq 1} \Delta^1_n$-$\mathsf{CA}_0$, it follows by compactness (of first order
logic) that $\mathsf{Z}_2 \nvdash (\forall n)[n$-$\boldsymbol{\Pi}^0_3$ determinacy$]$. So $\mathsf{Z}_2$ cannot prove, e.g., $\boldsymbol{\Delta}^0_4$ determinacy,
which is a further extension of Friedman’s theorem. The proof lies outside the scope
of this book, but it uses some of the same set theoretic methods that we survey in
Section 12.3.3.

For a more complete overview of the above development, including many additional
related results leading up to, and stemming from, Theorem 12.3.3, see the
discussion of Montalbán and Shore [222, Section 1].


12.3.2 Clopen and open determinacy 


In this section, we analyze the reverse mathematics strength of clopen and open determinacy.
These statements turn out to be equivalent, and require precisely arithmetical
transfinite recursion to prove.

By Proposition 2.8.2, if $S$ is open there is a tree $T \subseteq \omega^{<\omega}$ with $S = \omega^\omega \setminus [T]$.
Thus, Player I wins a run of $G(S)$ if and only if the constructed sequence is not an
element of $[T]$.

If $S$ is clopen, we may fix two trees, $T_0, T_1 \subseteq \omega^{<\omega}$, with $[T_0] = S$ and $[T_1] =
\omega^\omega \setminus S$. Here, it is convenient to think of a slightly different game. Let $T = T_0 \cap T_1$.
Players I and II play $n_0, n_1, n_2, \ldots$ as before, but now subject to the condition that for
all $i \in \omega$ and $j < 2$, $n_{2i+j}$ can only be played if $\langle n_0, n_1, \ldots, n_{2i+j} \rangle \in \mathrm{Ext}(T_j)$. Since $T$
is well founded, it follows that every run of this game is finite. We declare whichever
player reaches a leaf of $T$ first to be the winner. It is then easy to check that a player
has a winning strategy in this game if and only if that player has a winning strategy
in $G(S)$.
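When the tree of legal positions is not just well founded but finite, the winner of such a game can be found by backward induction from the leaves. The sketch below illustrates this simplified finite situation only; it is not the argument used in the text, where trees are infinitely branching and transfinite recursion is needed. The tree is assumed to be a finite set of tuples, and `leaf_winner` is a hypothetical table recording which player (1 or 2) wins at each leaf.

```python
def solve(tree, leaf_winner, position=()):
    """Return (winner, move): the player who wins from `position` with best
    play, and a winning move for the player to move if one exists."""
    depth = len(position)
    kids = sorted(b for b in tree
                  if len(b) == depth + 1 and b[:depth] == position)
    if not kids:
        return leaf_winner[position], None
    mover = 1 if depth % 2 == 0 else 2  # Player I moves at even length
    for b in kids:
        winner, _ = solve(tree, leaf_winner, b)
        if winner == mover:
            return mover, b[-1]
    # Every available move leads to a position won by the opponent.
    return 3 - mover, None
```

For the tree `{(), (0,), (1,), (0, 0), (0, 1), (1, 0)}` with leaves `(0, 0)` and `(1, 0)` won by Player I and `(0, 1)` won by Player II, the call `solve(tree, leaf_winner)` returns `(1, 1)`: Player I wins by opening with the move 1, since after the move 0 Player II could reply with 1.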


Theorem 12.3.4. The following statements are equivalent over $\mathsf{RCA}_0$.


1. $\mathsf{ATR}_0$.
2. Open determinacy.
3. Clopen determinacy.


Proof. (1) → (2): Fix a tree $T \subseteq \mathbb{N}^{<\mathbb{N}}$. Given a well ordering $X$, we define $O_x \subseteq
\mathbb{N}^{<\mathbb{N}}$, for $x \in X$, by transfinite recursion. Let $O_{0_X} = \{\alpha : \alpha \notin T\}$, and for $x >_X 0_X$, let


$O_x = \{\alpha : |\alpha| \text{ even} \wedge (\exists n)[\alpha n \in \bigcup_{y <_X x} O_y] \text{ or } |\alpha| \text{ odd} \wedge (\forall n)[\alpha n \in \bigcup_{y <_X x} O_y]\}$.


Let $O_X = \bigcup_{x \in X} O_x$. It is not difficult to see that if there is an $X$ such that $\mathrm{WO}(X)$
and $\langle\rangle \in O_X$, then Player I has a winning strategy. So suppose instead that there is
no such $X$. We claim that, in this case, Player II has a winning strategy.

Let $\psi(X)$ be the formula that says that $X$ is a linear ordering with least element
$0_X$ and there exists $P \subseteq X \times \mathbb{N}^{<\mathbb{N}}$ such that $P^{[0_X]} = \{\alpha : \alpha \notin T\}$, and for $x >_X 0_X$,


$P^{[x]} = \{\alpha : |\alpha| \text{ even} \wedge (\exists n)[\alpha n \in P^{[<_X x]}] \text{ or } |\alpha| \text{ odd} \wedge (\forall n)[\alpha n \in P^{[<_X x]}]\}$,

and $\langle\rangle \notin P_X = \bigcup_{x \in X} P_x$.

By hypothesis, $\mathrm{WO}(X) \to \psi(X)$ for all $X$, because if $X$ is a well ordering we can
form $\langle O_x : x \in X \rangle$ and let this be $P$ (i.e., $P = \bigcup_{x \in X} \{x\} \times O_x$). Now since $\psi$ is $\Sigma^1_1$,
it follows by Corollary 12.2.11 that we cannot have $\psi(X) \to \mathrm{WO}(X)$ for all $X$. So
there is some set $X$ such that $\psi(X)$ holds and $\neg \mathrm{WO}(X)$. Let $x_0 >_X x_1 >_X \cdots$ be an
infinite descending sequence in $X$. We want to show that Player II can always stay
outside of $P_{0_X}$ (and thus inside of $T$).

We prove that on the $i$th move, Player II can make sure its move is outside of $P_{x_i}$.
The strategy then is: if $|\alpha| = i$, play an $n$ such that $\alpha n \notin P_{x_i}$, and hence such that
$\alpha n \in T$ (since $T$ is closed downwards under $\preceq$). We can prove that this is always
possible. Indeed, using the definition of $P_{x_i}$ and the fact that $\langle\rangle \notin P_X$, it follows by
induction that


• if $|\alpha|$ is even and $\alpha \notin P_{x_i}$ then $(\forall n)[\alpha n \notin P_{x_{i+1}}]$,
• if $|\alpha|$ is odd and $\alpha \notin P_{x_i}$ then $(\exists n)[\alpha n \notin P_{x_{i+1}}]$.


This proves the claim.
(2) → (3): Clear.


(3) → (1): We use the alternate formulation of clopen games described above. We
argue in $\mathsf{RCA}_0$, and first claim that (3) implies $\mathsf{ACA}_0$. To better match the rest of the
argument, we prove that $Z'$ exists for every set $Z$, which is equivalent (Corollary 5.6.3).
We take $Z = \emptyset$. The general case is analogous. We define a game, a play of which
proceeds as follows.


• Player I plays $n \in \mathbb{N}$.
• Player II plays 1 or 0, which we think of as “yes” or “no”.
• Player I plays $s \in \mathbb{N}$.
• Player II plays $t \in \mathbb{N}$.


Player II wins if either it has played “yes” and then $\Phi_n(n)[t] \downarrow$, or it has played “no”
and $\Phi_n(n)[s] \uparrow$. Intuitively, Player I proposes a candidate $n$ for membership in $\emptyset'$,
and Player II says whether or not it thinks $n$ is in $\emptyset'$. If Player II thinks $n \notin \emptyset'$, the
burden of proof is on Player I to find an $s$ so that $\Phi_n(n)[s] \downarrow$. If Player II thinks
$n \in \emptyset'$, the burden of proof is on it to find a $t$ such that $\Phi_n(n)[t] \downarrow$.

This is a clopen game, if we think of the players as always having one more turn
if they have not yet lost (so for example, if Player II says “no” and Player I plays an
$s$ such that $\Phi_n(n)[s] \uparrow$, then Player I can play once more and the game stops). The
longest run of a play is therefore 5, which is why the tree of plays is well founded.
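The winning condition of this game can be mimicked in a toy setting. In the sketch below, the table `HALTING_TIME` is a hypothetical stand-in for the halting behaviour of the computations $\Phi_n(n)$: it records the number of steps after which the computation converges, or `None` if it diverges. No such table is computable in general, which is the point of the argument: a winning strategy for Player II has to encode $\emptyset'$. All names here are illustrative.

```python
# Hypothetical halting data: steps until Phi_n(n) converges, or None.
HALTING_TIME = {0: 3, 1: None, 2: 0}


def converges(n, stage):
    # Models the statement that Phi_n(n)[stage] converges.
    steps = HALTING_TIME[n]
    return steps is not None and steps <= stage


def player_two_wins(n, answer, s, t):
    # answer is 1 for "yes" and 0 for "no"; s is Player I's witness attempt
    # and t is Player II's witness.
    if answer == 1:
        return converges(n, t)
    return not converges(n, s)


def honest_strategy(n):
    # Player II answers correctly and, on "yes", supplies the halting time.
    steps = HALTING_TIME[n]
    return (1, steps) if steps is not None else (0, 0)
```

With this data, the honest strategy wins against every choice of `s`, while an incorrect “no” on input 0 loses as soon as Player I plays any `s` of at least 3, and an incorrect “yes” on input 1 loses for every `t`.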

Now Player I cannot have a winning strategy for this game, since for each $n \in \mathbb{N}$,
Player II can always respond with the correct answer about whether or not $n \in \emptyset'$.
Thus, by clopen determinacy, Player II has a winning strategy $f$, and we have


$(\forall n)[n \in \emptyset' \leftrightarrow f(\langle n \rangle) = \text{“yes”}]$.


So, we can argue in $\mathsf{ACA}_0$ from now on.
Now fix X such that WO(X). We extend the above argument to show that a 
exists. By Theorem 5.8.18 (and relativization), this implies ATRo. Without loss of 
generality, we may assume X has no <x-largest element. We build trees T,,,, for 
x € X such that Player I wins in T,,,, if and only ifn € OXI) To begin, let Tyo, 
be the tree corresponding to the game in which Player I plays an s € N, and then 
Player II plays some rt € N if and only if ®,(x)[s] T. Then Player I wins this game 
if and only ifn € @’. 

Now suppose $x \in X$ is the successor of $y \in X$ under $<_X$, and suppose $T_{m,y}$ has
been defined for all $m$. Let $T_{n,x}$ be the tree corresponding to the following game.


• Player I plays a (code for a) $\sigma \in 2^{<\omega}$ such that $\Phi^\sigma_n(n) \downarrow$.
• Player II chooses $m < |\sigma|$.
• If $\sigma(m) = 1$, Player II plays $m$, and on subsequent moves we start playing $T_{m,y}$,
with Player I starting.
• If $\sigma(m) = 0$, we start playing $T_{m,y}$, with Player II starting.


Intuitively, the play of o above represents an initial segment of @ '») long enough 
so that ® (n) |. Since we have convergence, we only need to check that o is indeed 
an initial segment of @‘* '»), and for this Player II queries the bits of o to see if they 
are in @'* 1!) If m < |o| is claimed (by ) to be in @'* !»), the burden of proof 
is on Player I to win T,,,, whereas if m is claimed not to be in @\*T¥) | the burden 
of proof is on Player II to win 7,,,,,, acting as the first Player In that game (which 
means Player I does not win that game). 

Now suppose $x$ is a limit under $<_X$ and that $T_{m,y}$ has been defined for all $m$ and
all $y <_X x$. Say $n = \langle m, y \rangle$. Then let $T_{n,x} = \{\langle\rangle\}$ if $y \geq_X x$ or $y \notin X$, and otherwise
let $T_{n,x} = T_{m,y}$. Then Player I wins $T_{n,x}$ if and only if $n = \langle m, y \rangle$ for some $y <_X x$
and Player I wins $T_{m,y}$.

In the definition above, the sequence $\langle T_{n,x} : n \in \mathbb{N} \rangle$ is uniformly $\Delta^0_1$ definable from
$\langle T_{m,y} : m \in \mathbb{N}, y <_X x \rangle$, and the distinction between whether $x$ is $0_X$, a successor,
or a limit, is arithmetical in $X$. Hence, using arithmetical comprehension and
arithmetical induction in $\mathsf{ACA}_0$, we can form $\langle T_{n,x} : x \in X, n \in \mathbb{N} \rangle$ (Exercise 5.13.20).
Thus, we can define the following game.


1. Player I plays $\langle n, x \rangle$ for some $x \in X$.

2. Player II chooses “yes” or “no”.

3. If “no”, Player II plays “no” and on subsequent moves we start playing $T_{n,x}$ with
Player I starting.

4. If “yes”, we start playing $T_{n,x}$ with Player II starting.


This is a clopen game since all of these trees $T_{n,x}$ are well founded. By clopen
determinacy, one of the two players has a winning strategy for it, but as above, it
cannot be Player I. Thus, Player II has a winning strategy $f$, and we have


$$(\forall n)[n \in \emptyset^{(X)} \leftrightarrow (\exists x \in X)[f(\langle n, x \rangle) = \text{“yes”}]].$$


This completes the proof. □


12.3.3 Gödel’s constructible universe


In this section, we quickly collect several definitions and lemmas concerning Gödel’s
constructible universe, $L$, which we will need to prove Friedman’s theorem. Let $\mathsf{ZFC}^-$
denote $\mathsf{ZFC}$ without the power set axiom. The following gives an important
connection between models of $\mathsf{ZFC}^-$ and models of $\mathsf{Z}_2$. The proof is left to Exercise 12.5.7.


Proposition 12.3.5. If $M \models \mathsf{ZFC}^-$, then $(\omega, M \cap \mathcal{P}(\omega)) \models \mathsf{Z}_2$.


Thus, one way to show that $\mathsf{Z}_2$ does not prove a certain result (for example, some
amount of determinacy) is to build a model of $\mathsf{ZFC}^-$ and appeal to the above proposition.
This is what we do below. Let us now describe the model of interest.

We begin with the definition of $L$. Let $\mathrm{Ord}$ denote the class of all ordinals. We
use lowercase letters to range over sets here, as is customary in set theory, but stick
to using capitals when wanting to emphasize that a set is a subset of $\omega$. (The sets $L_\alpha$
and $L$ below are an exception.)


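Definition 12.3.6. We define $L$ recursively along $\mathrm{Ord}$.


1. $L_0 = \emptyset$.
2. $L_{\alpha+1} = \{\{x \in L_\alpha : L_\alpha \models \varphi(x, a)\} : a \in L_\alpha,\ \varphi \text{ is a first order formula}\}$.
3. $L_\lambda = \bigcup_{\alpha < \lambda} L_\alpha$ for limit ordinals $\lambda$.

Then $L$ is the proper class $L = \bigcup_{\alpha \in \mathrm{Ord}} L_\alpha$.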


$L$ was introduced by Gödel [126], in his famous work showing the consistency
with $\mathsf{ZF}$ of the continuum hypothesis and the axiom of choice. We identify
$L_\alpha$ and $L$ with the structures $(L_\alpha, \in)$ and $(L, \in)$, respectively. $L$ satisfies
$\mathsf{ZFC}$ + “there is a well ordering of the universe”. This well ordering, called $<_L$, is
defined by putting a canonical well ordering on each $L_\alpha$, and ordering all elements
in $L_{\alpha+1} \setminus L_\alpha$ above all elements of $L_\alpha$.

We will make use of the following two lemmas. 


Lemma 12.3.7. There are unboundedly many $\alpha < \omega_1$ such that $L_\alpha$ is an elementary
substructure of $L_{\omega_1}$.


Proof. Fix any $\alpha < \omega_1$. Let $\beta_0 = \alpha$, and given $\beta_n$, let $\beta_{n+1}$ be the least $\gamma$ with
$\beta_n < \gamma < \omega_1$ such that for all $\varphi$ and $a \in L_{\beta_n}$, if $L_{\omega_1} \models (\exists x)\varphi(x, a)$ then there is
some $b$ in $L_\gamma$ witnessing this. (That such a $\gamma$ exists for a single choice of $\varphi$ and $a$
follows from the fact that $\omega_1$ is a limit. That we can choose a single $\gamma$ that works
for all $\varphi$ and $a$ simultaneously follows from the fact that there are only countably
many such formulas and tuples, and $\omega_1$ has cofinality $\omega_1$.) Let $\beta = \sup_n \beta_n$. By the
Tarski–Vaught test, $L_\beta$ is an elementary substructure of $L_{\omega_1}$. □


Lemma 12.3.8. $L_{\omega_1} \models \mathsf{ZFC}^-$.


Proof. We check each axiom, one by one. Comprehension for subsets of a set $z$
follows from taking some $\alpha < \omega_1$ such that $z \in L_\alpha$ and such that $L_\alpha$ is an elementary
substructure of $L_{\omega_1}$. Other axioms are similar, or are satisfied simply because $\in$ is
represented by actual set membership. □

We can now define the model that will be of interest to us.

Definition 12.3.9. $\beta_0$ is the least $\beta \in \mathrm{Ord}$ such that $L_\beta \models \mathsf{ZFC}^-$.


By the previous two lemmas, $\beta_0 < \omega_1$. In particular, $L_{\beta_0}$ is countable. The following
theorem lists one of its crucial properties: $\Sigma^1_1$ formulas are absolute to $L_{\beta_0}$.


Theorem 12.3.10. If $\varphi(X)$ is a $\Sigma^1_1$ formula of $\mathcal{L}_2$, then $L_{\beta_0} \models (\exists X)\varphi(X)$ if and only
if $V \models (\exists X)\varphi(X)$.


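Proof. First, note that if $X \in L_{\beta_0}$ with $X \subseteq \omega^2$, then $\mathrm{WO}(X)$ is absolute to $L_{\beta_0}$.
Indeed, if $L_{\beta_0} \models \mathrm{WO}(X)$ then there is an isomorphism $f\colon X \to (\alpha, \in)$ in $L_{\beta_0}$, for
some $\alpha \in \mathrm{Ord}^{L_{\beta_0}}$. Since $\mathrm{Ord}^{L_{\beta_0}} \subseteq \mathrm{Ord}$, it follows that $f$ witnesses $\mathrm{WO}(X)$ in
the universe. Conversely, if $L_{\beta_0} \models \neg\mathrm{WO}(X)$ then there is some infinite descending
sequence $f\colon \omega \to X$ in $L_{\beta_0}$, and this also witnesses $\neg\mathrm{WO}(X)$ in the universe.
Now suppose $\varphi(X)$ is $\Sigma^1_1$. Then by Corollary 5.8.4, there is a tree $T \subseteq \omega^{<\omega}$ such
that $\mathsf{ACA}_0 \vdash (\exists X)\varphi(X) \leftrightarrow \neg\mathrm{WF}(T) \leftrightarrow \neg\mathrm{WO}(\mathrm{KB}(T))$. By Proposition 12.3.5, this
equivalence holds in $L_{\beta_0}$. Hence, by the remarks above, we have $L_{\beta_0} \models (\exists X)\varphi(X) \leftrightarrow
V \models (\exists X)\varphi(X)$, as was to be shown. □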


More generally, $\omega$-models with the property that a $\Sigma^1_1$ formula of $\mathcal{L}_2$ is satisfied
if and only if it is true are called $\beta$-models. By the preceding theorem, the $\omega$-model
$(\omega, L_{\beta_0} \cap \mathcal{P}(\omega))$ is actually a $\beta$-model. There is a rich theory of $\beta$-models and their
applications in reverse mathematics. Simpson [288, Chapter VII] provides a detailed
exposition.

We close with one final definition and lemma, which will be needed for technical 
reasons. After that, we will be ready to prove Friedman’s theorem. 


Definition 12.3.11. We say a model $M$ is well founded if for every $a \in M$, if
$M \models a \in \mathrm{Ord}$, then $\{\beta \in M : \beta \in^M a\}$ is well ordered by $<^M$.


Lemma 12.3.12. Every well founded model $M$ of $\mathsf{ZFC}^- + V = L$ is isomorphic to
$L_\alpha$ for some $\alpha$.


Proof. Let $\alpha$ be the least ordinal not in $M$. Then there is a bijection $f\colon \alpha \to \mathrm{Ord}^M$.
By transfinite recursion, build, for $\beta < \alpha$, an isomorphism $g_\beta\colon L_\beta \to (L_{f(\beta)})^M$. Let
$g = \bigcup_\beta g_\beta$. Then $g$ is the desired isomorphism. □


12.3.4 Friedman’s theorem 


Our goal in this section is to prove Friedman’s theorem that $\Sigma^0_5$ determinacy is not
provable in $\mathsf{Z}_2$. We actually prove a strengthening, due to Martin (unpublished), that
the same is true for $\Sigma^0_4$ determinacy. Martin [204, p. 39] says of this proof that it is
“essentially the same” as Friedman’s.

At the level of $\Sigma^0_4$ we may restrict ourselves to games $G(S)$ where $S \subseteq 2^\omega$ and
where Players I and II alternate playing 0s and 1s (rather than arbitrary numbers). We
will call such games binary, to emphasize the difference. It is not difficult to see that
$\Sigma^0_4$ determinacy for games (in the full sense of Definition 12.3.1) is equivalent to
$\Sigma^0_4$ determinacy for binary games. In fact, the same is true even for $\Sigma^0_3$ determinacy.
However, for lower levels in the arithmetical hierarchy there is a difference. A nice
discussion is given by Montalbán and Shore [222, Section 2].

In what follows, we keep in mind a fixed enumeration of all sentences in the
language of set theory with parameters from $L_{\beta_0}$, which is a countable collection.
We can then regard infinite sequences of 0s and 1s (such as plays of binary games)
as specifying a subset of this collection via its characteristic function.


Lemma 12.3.13. Suppose $G$ is a binary game in $L_{\beta_0}$ such that:


• If Player I plays $\mathrm{Th}_{L_{\beta_0}}$, then Player I wins.
• If not, and if Player II plays $\mathrm{Th}_{L_{\beta_0}}$, then Player II wins.


Then $L_{\beta_0} \not\models$ “$G$ is determined”.


Proof. First, we prove that $L_{\beta_0} \models$ “Player II has no winning strategy”. Suppose there
is an $s \in L_{\beta_0}$ such that $L_{\beta_0} \models$ “$s$ is a winning strategy for Player II”. Then since $s$ is
winning, for every $r \in 2^\omega$, $r \otimes s \notin G$, so being a winning strategy is a $\Pi^1_1$ statement.
By Theorem 12.3.10, $s$ must be a winning strategy also in $V$. But no strategy can be
winning for Player II in $V$, since Player I can always play $r = \mathrm{Th}_{L_{\beta_0}}$.

Next, we prove that $L_{\beta_0} \models$ “Player I has no winning strategy”. Suppose otherwise,
so that there is some winning strategy $s \in L_{\beta_0}$ for Player I. Then again $s$ is a winning
strategy in $V$. Now suppose Player II plays the “copy cat” strategy, which means
that whenever Player I plays $i \in \{0, 1\}$, Player II plays $i$ on the next move. Then the
outcome of this play must be $\mathrm{Th}_{L_{\beta_0}} \otimes \mathrm{Th}_{L_{\beta_0}}$. This is because, if
Player I ever played an $i$ which was “wrong” in terms of playing $\mathrm{Th}_{L_{\beta_0}}$ (i.e., either
$i = 0$ and $\varphi_j \in \mathrm{Th}_{L_{\beta_0}}$ or $i = 1$ and $\varphi_j \notin \mathrm{Th}_{L_{\beta_0}}$), then at the first place that this
happens, Player II could play $1 - i$ on the next move, and thereafter continue to play
$\mathrm{Th}_{L_{\beta_0}}$. Since Player I could never correct its mistake, it would end up not playing
$\mathrm{Th}_{L_{\beta_0}}$, while Player II would, so Player II would win, contrary to the fact that Player
I is playing a winning strategy.

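Therefore, we have that Player I plays $s(\langle\rangle)$, then Player II plays $s(\langle\rangle)$, then Player
I plays $s(s(\langle\rangle))$, and so on, meaning that


$$s(\langle\rangle)\, s(s(\langle\rangle))\, s(s(s(\langle\rangle))) \cdots = \mathrm{Th}_{L_{\beta_0}},$$


and therefore $\mathrm{Th}_{L_{\beta_0}} \leq_T s$. But then $\mathrm{Th}_{L_{\beta_0}}$ is definable by a single formula $\psi$, so for
every formula $\varphi$ we have $\varphi \in \mathrm{Th}_{L_{\beta_0}} \leftrightarrow \mathrm{Th}_{L_{\beta_0}} \vdash \psi(\ulcorner \varphi \urcorner)$. This cannot be, by Tarski’s
theorem on the undefinability of truth. □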
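
The copy cat step in the proof can be illustrated with a small simulation. The following is only a sketch under assumptions of our own: a strategy for Player I is encoded as a Python function from the list of Player II's earlier moves to a bit, and `example_strategy` is an arbitrary placeholder, not a strategy from the proof. The point it shows is that, against the copy cat, the whole play is computed from the strategy alone.

```python
from typing import Callable, List, Tuple

# Hypothetical encoding: a strategy for Player I maps the list of
# Player II's previous moves to Player I's next bit.
StrategyI = Callable[[List[int]], int]


def play_against_copycat(s: StrategyI, rounds: int) -> Tuple[List[int], List[int]]:
    """Play `rounds` rounds in which Player II echoes Player I's latest bit."""
    moves_I: List[int] = []
    moves_II: List[int] = []
    for _ in range(rounds):
        i = s(moves_II)      # Player I responds to Player II's moves so far
        moves_I.append(i)
        moves_II.append(i)   # copy cat: Player II repeats the same bit
    return moves_I, moves_II


def example_strategy(history_II: List[int]) -> int:
    """An arbitrary illustrative strategy (an assumption of this sketch)."""
    last = history_II[-1] if history_II else 0
    return (len(history_II) + last) % 2


if __name__ == "__main__":
    print(play_against_copycat(example_strategy, 5))
```

Both players end up playing the same sequence, and each bit is obtained by feeding the earlier bits back into the strategy, which mirrors why the common sequence is computable from $s$.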


Theorem 12.3.14 (Friedman [107]; Martin, unpublished). $\mathsf{Z}_2$ does not prove $\Sigma^0_4$
determinacy.


Our aim is to build a binary $\Sigma^0_4$ game $G$ which satisfies the conditions of
Lemma 12.3.13. This will imply the theorem by Proposition 12.3.5. In $G$, Players I
and II will play strings of 0s and 1s, as before. Let us call the collections of formulas
(coded by the sequences) they play $T_{\mathrm{I}}$ and $T_{\mathrm{II}}$, respectively.

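Let $P$ be I or II, indicating a player. Let $M_P$ be the term model of $T_P$ defined as
follows.


• For each $\varphi(x)$ such that $T_P \vdash (\exists! x)\varphi(x)$ add a term $t_\varphi$.
• Let $t_\varphi \sim t_\psi$ if and only if $T_P \vdash (\forall x)[\varphi(x) \leftrightarrow \psi(x)]$.
• Let $M_P = \{t_\varphi\}/{\sim}$.


Then we can prove $M_P \models T_P$, where $M_P \models t_\varphi \in t_\psi$ if and only if $T_P \vdash (\forall x, y)[\varphi(x) \wedge
\psi(y) \to x \in y]$. For this we use the fact that if $T_P \vdash (\exists x)\psi(x)$ then $T_P \vdash \psi(t_\theta)$,
where $\theta(x) = \psi(x) \wedge (\forall y <_L x)[\neg\psi(y)]$. So $M_P$ contains witnesses.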


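Proof (of Theorem 12.3.14; see [204]). The game $G$ is defined as follows.


• Players have to play consistent complete extensions of $\mathsf{ZFC}^-$ + “$V = L$” +
“$(\forall \alpha \in \mathrm{Ord})[L_\alpha \not\models \mathsf{ZFC}^-]$”. (Note that this is a $\Pi^0_1$ condition.)

• Players have to play $\omega$-models. So if $x \in M_{\mathrm{I}}$ and $M_{\mathrm{I}} \models x \in \omega$ then for some
$n \in \omega$, $M_{\mathrm{I}} \models x = n$. Similarly for Player II. (Note that this is a $\Pi^0_2$ clause.)

• Player I wins if one of the following conditions holds.


– $\mathrm{Ord}^{M_{\mathrm{I}}}$ is isomorphic to an initial segment of $\mathrm{Ord}^{M_{\mathrm{II}}}$.
– There is an $\alpha \in \mathrm{Ord}^{M_{\mathrm{I}}}$ such that $\alpha$ is isomorphic to a proper initial segment
of $\mathrm{Ord}^{M_{\mathrm{II}}}$ but $\alpha + 1$ is not.


Otherwise, Player II wins.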


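Claim 1: If Player I plays $\mathrm{Th}_{L_{\beta_0}}$ then $M_{\mathrm{I}} = L_{\beta_0}$ and Player I wins. The first part
follows because everything in $\mathrm{Th}_{L_{\beta_0}}$ is definable from a term. To prove the second
part, there are three possibilities for what can happen.


Case 1: $\mathrm{Ord}^{M_{\mathrm{I}}}$ is an initial segment of $\mathrm{Ord}^{M_{\mathrm{II}}}$. Then Player I wins by definition.


Case 2: $\mathrm{Ord}^{M_{\mathrm{II}}}$ is a proper initial segment of $\mathrm{Ord}^{M_{\mathrm{I}}}$. Then if $M_{\mathrm{II}}$ models $V = L$,
$M_{\mathrm{II}} = L_\alpha$ for some $\alpha < \beta_0$ by Lemma 12.3.12. But then $M_{\mathrm{II}}$ does not satisfy $\mathsf{ZFC}^-$
by definition of $\beta_0$, so this cannot happen.

Case 3: $\mathrm{Ord}^{M_{\mathrm{I}}}$ and $\mathrm{Ord}^{M_{\mathrm{II}}}$ are incomparable. Then it cannot be that each $\alpha \in \mathrm{Ord}^{M_{\mathrm{I}}}$
is isomorphic to a proper initial segment of $\mathrm{Ord}^{M_{\mathrm{II}}}$, and it is not difficult to see that
the least $\alpha$ for which this is not the case cannot be a limit. That is, there is an
$\alpha \in \mathrm{Ord}^{M_{\mathrm{I}}}$ such that $\alpha$ is isomorphic to a proper initial segment of $\mathrm{Ord}^{M_{\mathrm{II}}}$ but $\alpha + 1$
is not. So Player I wins.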


This proves Claim 1. 


Claim 2: If Player I does not play $\mathrm{Th}_{L_{\beta_0}}$, but Player II does, then $M_{\mathrm{I}} \neq L_{\beta_0}$ and
$M_{\mathrm{II}} = L_{\beta_0}$ and Player II wins. The first part is clear. For the second, there are the
same three possibilities as in the previous claim.


Case 1: $\mathrm{Ord}^{M_{\mathrm{I}}}$ is an initial segment of $\mathrm{Ord}^{M_{\mathrm{II}}}$. Player II wins by the same argument
used to prove that Player I wins in case 2 of the preceding claim, but with $M_{\mathrm{I}}$ and
$M_{\mathrm{II}}$ reversed.

Case 2: $\mathrm{Ord}^{M_{\mathrm{II}}}$ is a proper initial segment of $\mathrm{Ord}^{M_{\mathrm{I}}}$. Player II wins because neither
of the two conditions for Player I to win is met.


Case 3: $\mathrm{Ord}^{M_{\mathrm{I}}}$ and $\mathrm{Ord}^{M_{\mathrm{II}}}$ are incomparable. Suppose $\alpha \in \mathrm{Ord}^{M_{\mathrm{I}}}$ is isomorphic to
a proper initial segment of $\mathrm{Ord}^{M_{\mathrm{II}}}$. Since $\mathrm{Ord}^{M_{\mathrm{II}}}$ is a subclass of the real ordinals,
there must be some $\beta \in \mathrm{Ord}^{M_{\mathrm{II}}}$ that is isomorphic to this initial segment. But then
$\alpha + 1$ is isomorphic to $\beta + 1$, and $\beta + 1$ is isomorphic to a proper initial segment
of $\mathrm{Ord}^{M_{\mathrm{II}}}$, namely the initial segment that $\beta$ is isomorphic to followed by the least
element of $\mathrm{Ord}^{M_{\mathrm{II}}}$ not in this segment. Thus, Player I does not win, so Player II wins.


This proves Claim 2. 

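It remains to show that $G$ is $\Sigma^0_4$, i.e., that the winning class for Player I has
a $\Sigma^0_4$ definition. Note that $\mathrm{Ord}^{M_{\mathrm{I}}}$ is an initial segment of $\mathrm{Ord}^{M_{\mathrm{II}}}$ if and only if
for all $X \in M_{\mathrm{I}}$, if $M_{\mathrm{I}} \models X \subseteq \omega^2 \wedge \mathrm{WO}(X)$ then there is a $Y \in M_{\mathrm{II}}$ such that
$M_{\mathrm{II}} \models Y \subseteq \omega^2 \wedge \mathrm{WO}(Y)$ and $X$ and $Y$ represent the same subset of $\omega$, which means
that for every $n \in \omega$, $M_{\mathrm{I}} \models n \in X \leftrightarrow M_{\mathrm{II}} \models n \in Y$. This is a $\Pi^0_3$ definition. Similarly,
saying there is an $\alpha \in \mathrm{Ord}^{M_{\mathrm{I}}}$ such that $\alpha$ is isomorphic to an initial segment of
$\mathrm{Ord}^{M_{\mathrm{II}}}$ but $\alpha + 1$ is not is $\Sigma^0_4$. □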


12.4 Higher order reverse mathematics 


To this point, we have considered reverse mathematics in the setting of second order 
arithmetic. Using higher order arithmetic instead gives a different perspective on 
many theorems. Higher order arithmetic features higher types, such as functions 
from $\omega^\omega$ to $\omega^\omega$, which allow for more direct coding methods. At the same time, the
higher order computability theory implicit in higher order arithmetic is more com- 
plicated than the classical computability theory implicit in second order arithmetic. 
Thus higher order reverse mathematics is not an extension of second order reverse 
mathematics: it is a genuinely alternative approach with its own techniques, motiva- 
tions, and interpretations. In this section we will sketch the fundamental definitions 
and several results to show how higher order reverse mathematics can help fill out 
our understanding of particular theorems. 

The approach most commonly used for higher order reverse mathematics was pro- 
posed by Kohlenbach [185]. This approach leverages systems of higher order arith- 
metic that are well known in proof theory (see Feferman [104], Kohlenbach [183], 
Avigad and Feferman [10] and Troelstra [313]). In particular, rather than simply 
extending to third order arithmetic or fourth order arithmetic, higher order reverse 
mathematics uses arithmetic in all finite types. In practice, this makes the definitions 
more straightforward, although the higher types (e.g. above fourth order) have little 
role. 

Rather than using a set-based language like that of $\mathsf{RCA}_0$, this approach uses a
function-based language. The first order part is the same, with operations for numerical
addition, multiplication, and an equality relation on numbers. At higher levels, instead
of a set membership operator $\in$, we have an operation for function composition.
However, the functions we compose will have various higher types.


Definition 12.4.1. The collection of finite types is defined inductively as follows. 


• Type 0 consists of the natural numbers.

• If $\rho$ and $\tau$ are types, then type $\rho \to \tau$ consists of all functions from type $\rho$ to type
$\tau$. This type is also written $\tau(\rho)$ in the literature. When there is no confusion,
we write $\sigma \to \tau \to \rho$ in place of $\sigma \to (\tau \to \rho)$; this means that $\to$ is treated
as “right associative”.


The pure types are denoted by natural numbers, and also defined inductively. 


• 0 is a pure type.

• If $n$ is a pure type, $n + 1$ is the type $n \to 0$. In particular, type $1 = 0 \to 0$ is the
type of functions from $\omega$ to $\omega$, and $2 = (0 \to 0) \to 0$ is the type of functions
from $\omega^\omega$ to $\omega$.


Every type is equivalent to some pure type, in the sense that the function spaces are
effectively isomorphic. These equivalences are often easiest to demonstrate by also
considering product types $\rho \times \tau$ and using the equivalence known as Currying:


$$\rho \to (\tau \to \sigma) \;\cong\; (\rho \times \tau) \to \sigma \;\cong\; \tau \to (\rho \to \sigma).$$


For example:


• Type $0 \to 1 = 0 \to (0 \to 0)$ is equivalent to $(0 \times 0) \to 0$, which is equivalent
to type $1 = 0 \to 0$ through the use of a pairing function.

• Type $1 \to 1 = (0 \to 0) \to (0 \to 0)$ is equivalent to $((0 \to 0) \times 0) \to 0$. Using
an effective pairing method again to convert $(0 \to 0) \times 0$ to $0 \to 0$, we see that
$1 \to 1$ is equivalent to type $2 = (0 \to 0) \to 0$.
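
The first example can be made concrete with a short sketch. The Cantor pairing function used here is one standard choice and is an assumption of this illustration, as are all of the function names; the text only requires some effective pairing.

```python
from typing import Callable, Tuple


def curry(f: Callable[[int, int], int]) -> Callable[[int], Callable[[int], int]]:
    """Turn a two-argument function into a function returning a function."""
    return lambda x: lambda y: f(x, y)


def uncurry(g: Callable[[int], Callable[[int], int]]) -> Callable[[int, int], int]:
    """Inverse of curry."""
    return lambda x, y: g(x)(y)


def pair(a: int, b: int) -> int:
    """Cantor pairing: a bijection between pairs of naturals and naturals."""
    return (a + b) * (a + b + 1) // 2 + b


def unpair(n: int) -> Tuple[int, int]:
    """Inverse of the Cantor pairing."""
    w = 0
    while (w + 1) * (w + 2) // 2 <= n:   # find the largest w with w(w+1)/2 <= n
        w += 1
    b = n - w * (w + 1) // 2
    return w - b, b


def type01_to_type1(F):
    """Convert an object of type 0 -> 1 into an object of type 1."""
    return lambda n: uncurry(F)(*unpair(n))


def type1_to_type01(f):
    """Convert an object of type 1 back into an object of type 0 -> 1."""
    return curry(lambda a, b: f(pair(a, b)))
```

Composing the two conversions in either order returns a function with the same values, which is the sense in which the two function spaces are effectively isomorphic.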


The degree of a type $\rho$ is the number $n$ so that $\rho$ is equivalent to pure type $n$.
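
The degree can be computed by recursion on the type, using the formula $\deg(\rho \to \tau) = \max\{\deg(\rho) + 1, \deg(\tau)\}$ stated in Exercise 12.5.8. The sketch below assumes a representation of our own choosing: the type 0 is the integer `0` and an arrow type is a tagged tuple.

```python
def arrow(rho, tau):
    """Build the type rho -> tau (hypothetical tuple representation)."""
    return ("->", rho, tau)


def pure(n: int):
    """The pure type n: 0, or (n - 1) -> 0."""
    return 0 if n == 0 else arrow(pure(n - 1), 0)


def deg(t) -> int:
    """Degree of a finite type, following the formula in Exercise 12.5.8."""
    if t == 0:
        return 0
    _, rho, tau = t
    return max(deg(rho) + 1, deg(tau))
```

On the two examples above, `deg(arrow(0, pure(1)))` evaluates to 1 and `deg(arrow(pure(1), pure(1)))` evaluates to 2, in agreement with the equivalences with types 1 and 2.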

We use $\lambda$ notation to informally describe a function of a given type. If $T^\sigma(x^\tau)$ is
a term of type $\sigma$ with a parameter for a variable $x$ of type $\tau$, the notation $\lambda x^\tau.T^\sigma(x)$
refers to the function $f\colon \tau \to \sigma$ sending each $z$ of type $\tau$ to $T(z)$.

The precise collection of allowable terms $T^\sigma$ will vary with the particular formal
system we are using, but at the very least we will have an infinite collection of
variables of each type. The notation $x^\tau$ indicates that the variable $x$ is limited to
values of type $\tau$; we can omit the superscript when the type is clear. Moreover, for
each term $T^{\sigma \to \rho}$ and each term $s^\sigma$ we have a term $T(s)$ of type $\rho$. We will see
several examples below that clarify the way $\lambda$ notation is used.

In function-based systems, rather than using set existence axioms, we use axioms 
that assert the existence of various higher order functions, which in this context are 
sometimes called combinators. For example: 


• For each pair of types $\sigma, \tau$ there is a combinator $K_{\sigma,\tau}$ (also written $\Pi_{\sigma,\tau}$) of
type $\sigma \to \tau \to \sigma$ that allows us to form constant functions:


$$K_{\sigma,\tau}(x^\sigma) = \lambda y^\tau.[x].$$

• For each triple of types $\rho, \sigma, \tau$ there is a combinator $S_{\rho,\sigma,\tau}$ of type
$$(\rho \to \sigma \to \tau) \to (\rho \to \sigma) \to (\rho \to \tau).$$


To understand this combinator, use Currying to view a function $f^{\rho \to \sigma \to \tau}(x^\rho)$ as if it
were a function $f^\tau(x^\rho, y^\sigma)$. Then, given a function $g\colon \rho \to \sigma$, we can form


$$S_{\rho,\sigma,\tau}(f, g) = \lambda x^\rho.[f(x, g(x))].$$

• For each type $\sigma$, we have a combinator $R_\sigma$ of type
$$\sigma \to (0 \to \sigma \to \sigma) \to 0 \to \sigma.$$


We view $R_\sigma$ as a higher order functional of the form $R(f^\sigma, g^{0 \to \sigma \to \sigma}, n^0)$. The
axioms for this combinator allow us to define functions by recursion:


$$R(f, g, 0) = f;$$
$$R(f, g, n + 1) = g(n, R(f, g, n)).$$
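
The behavior of these three combinators can be sketched with curried Python functions. This is an untyped illustration only: the type subscripts are dropped, and the names `K`, `S`, `R`, `identity`, `add`, and `factorial` are choices made for this sketch.

```python
def K(x):
    """K(x) is the constant function with value x."""
    return lambda y: x


def S(f):
    """S(f)(g)(x) = f(x)(g(x)), i.e. f applied to x and g(x)."""
    return lambda g: lambda x: f(x)(g(x))


def R(f):
    """Recursor: R(f)(g)(0) = f and R(f)(g)(n + 1) = g(n)(R(f)(g)(n))."""
    def with_step(g):
        def rec(n: int):
            acc = f
            for k in range(n):       # unfold the recursion from 0 up to n
                acc = g(k)(acc)
            return acc
        return rec
    return with_step


# The identity function arises as S K K.
identity = S(K)(K)


def add(m: int):
    """Addition m + n defined by recursion on n."""
    return R(m)(lambda k: lambda acc: acc + 1)


# Factorial defined by recursion: 0! = 1 and (n + 1)! = (n + 1) * n!.
factorial = R(1)(lambda k: lambda acc: (k + 1) * acc)
```

Writing the recursor as a loop rather than a recursive call is a design choice made only to avoid Python's recursion limit; the defining equations are the same.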


Definition 12.4.2. The system $\mathsf{E\text{-}PRA}^\omega$ is a formal system in many-sorted classical
logic in the language of arithmetic in all finite types, along with:


• Terms and defining axioms for $K_{\sigma,\tau}$ and $S_{\rho,\sigma,\tau}$, for all types $\rho, \sigma, \tau$;

• A term and defining axioms for the combinator $R_0$, that is, the $R$ combinator in
type 0 only;

• Terms for the number 0, and terms and defining axioms for the successor,
addition, and multiplication operations, and the order relation. These axioms
state that $\omega$ is an ordered semiring;

• For all types $\rho$ and $\tau$, an axiom of extensionality


$$(\forall x^\rho, y^\rho, z^{\rho \to \tau})[x =_\rho y \to z(x) =_\tau z(y)];$$


• The axiom scheme for quantifier-free induction. For each quantifier-free formula
$A_0$ this scheme includes


$$(A_0(0) \wedge (\forall x)[A_0(x) \to A_0(x + 1)]) \to (\forall x) A_0(x).$$


Definition 12.4.3. In the language of $\mathsf{E\text{-}PRA}^\omega$, the principle of quantifier free choice,
$\mathsf{QF\text{-}AC}^{\rho,\tau}$, is the axiom scheme containing each formula of the form


$$(\forall x^\rho)(\exists y^\tau) A_0(x, y) \to (\exists Y^{\rho \to \tau})(\forall x^\rho) A_0(x, Y(x))$$
where $A_0$ is quantifier-free and may have parameters.


The quantifier free choice principle $\mathsf{QF\text{-}AC}^{0,0}$ can be used to construct many
useful functions of type 1. For example, given $m^0$, for each $x^0$ there is a $y^0$ such that
$y = 0$ and $x = m$, or $y = 1$ and $x \neq m$. Hence there is a function $\mathrm{Eq}_m(x)$ so that
$\mathrm{Eq}_m(x) = 0$ if $x = m$ and $\mathrm{Eq}_m(x) = 1$ otherwise.

We now have enough definitions to define the standard base systems for higher
order reverse mathematics.


Definition 12.4.4. The system $\mathsf{RCA}_0^\omega$ is defined to be $\mathsf{E\text{-}PRA}^\omega + \mathsf{QF\text{-}AC}^{1,0}$. The
system $\mathsf{RCA}_0^2$ is defined to consist of $\mathsf{E\text{-}PRA}^2$ (the second order part of $\mathsf{E\text{-}PRA}^\omega$)
along with $\mathsf{QF\text{-}AC}^{0,0}$.


Although the definition only includes quantifier-free induction, $\mathsf{RCA}_0^\omega$ proves $\Sigma^0_1$
induction, using $\mathsf{QF\text{-}AC}^{0,0}$ (Exercise 12.5.9). The following result is Exercise 12.5.10.


Proposition 12.4.5 (see Kohlenbach [185]). The systems $\mathsf{RCA}_0$ and $\mathsf{RCA}_0^2$ are
bi-interpretable.


Corollary 12.4.6. $\mathsf{RCA}_0^\omega$ has an $\omega$-model in which the set of objects of type 1 is
exactly the set of computable functions.


As with second order reverse mathematics, higher order reverse mathematics 
results will show that a particular statement is provable from, or equivalent to, a 
function existence principle in higher order arithmetic. One key function principle 
is $(\exists^2)$:

$$(\exists^2):\ (\exists E^2)(\forall f^1)\big(E(f) = 0 \leftrightarrow (\exists x^0)[f(x) = 0]\big).$$
The strength of principles such as $(\exists^2)$ comes from the ability to combine them with
other functions and combinators.


Example 12.4.7. $\mathsf{RCA}_0^\omega + (\exists^2)$ proves that every function $g^{0 \to 0}$ has a range. In this
setting, a set is represented by its characteristic function. Given a function $g^1$, in
$\mathsf{RCA}_0^\omega$ we can form the function


$$h(m^0, x^0) = \begin{cases} 0 & \text{if } g(x) = m, \\ 1 & \text{otherwise.} \end{cases}$$

Thus $h(m) = \lambda x^0.h(m, x)$ is type 1 for each $m$. Then $r(m) = 1 \mathbin{\dot{-}} E^2(h(m))$ is the
characteristic function of the range of $g$.
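
The shape of this construction can be sketched in code, with one important caveat: the functional $E^2$ is not computable, so the sketch replaces it with a search up to a fixed bound. That bounded stand-in, the normalization that sends members of the range to 1, and all names below are assumptions of this illustration rather than part of the example.

```python
from typing import Callable


def exists_zero_bounded(f: Callable[[int], int], bound: int) -> int:
    """Bounded stand-in for E: return 0 if f(x) == 0 for some x < bound, else 1."""
    return 0 if any(f(x) == 0 for x in range(bound)) else 1


def h(g: Callable[[int], int]):
    """h(g)(m) is the type 1 function sending x to 0 if g(x) == m and to 1 otherwise."""
    return lambda m: lambda x: 0 if g(x) == m else 1


def monus(a: int, b: int) -> int:
    """Truncated subtraction on the natural numbers."""
    return max(a - b, 0)


def range_indicator(g: Callable[[int], int], bound: int):
    """Approximate characteristic function of the range of g, searching below bound."""
    return lambda m: monus(1, exists_zero_bounded(h(g)(m), bound))
```

With `g` the squaring function and a bound of 100, the resulting function sends 49 to 1 and 50 to 0. The genuine $E^2$ would give the correct answer without any bound, which is exactly what makes it a strong principle.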


The example suggests that $\mathsf{RCA}_0^\omega + (\exists^2)$ can serve as a higher order version of
$\mathsf{ACA}_0$; sometimes this system is denoted $\mathsf{ACA}_0^\omega$. The statements shown equivalent to
$(\exists^2)$ in the following result are genuinely third-order statements, and could not be
stated in second order arithmetic.


Proposition 12.4.8 (Kohlenbach [185]). The following are equivalent to $(\exists^2)$ over
$\mathsf{RCA}_0^\omega$:


1. There exists a function $f\colon \mathbb{R} \to \mathbb{R}$ that is not everywhere sequentially continuous.
2. There is a function $P\colon \mathbb{R} \to \mathbb{R}$ such that


$$P(x) = \begin{cases} 0 & \text{if } x < 0; \\ 1 & \text{if } x > 0. \end{cases}$$

Moving to a higher order setting also allows us to state principles without the
countability restrictions inherent in second order arithmetic. For example, we saw a
second order version of the Heine–Borel theorem in Theorem 10.5.5. Normann and
Sanders [234] and Sanders [270] have studied the following higher order version of
the Heine–Borel theorem:


$$(\mathsf{HBU}):\ (\forall \Phi\colon \mathbb{R} \to \mathbb{R}^+)(\exists y_0, \ldots, y_k \in [0, 1])(\forall x \in [0, 1])\Big[x \in \bigcup_{i \leq k} B(y_i, \Phi(y_i))\Big].$$


Intuitively, this principle is considering an open cover of $[0, 1]$ in which each point
$x$ is covered by the interval $B(x, \Phi(x))$; the conclusion is that $[0, 1]$ is contained in
a finite number of these intervals.
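
For a well behaved radius function, a finite subcover can be found by a simple greedy walk from left to right. The sketch below is an illustration under strong assumptions of our own: `phi` takes and returns exact rationals and is bounded below by a positive constant, so the walk terminates. For an arbitrary $\Phi$ no such procedure need terminate, and the sketch says nothing about the logical strength of HBU.

```python
from fractions import Fraction
from typing import Callable, List


def finite_subcover(phi: Callable[[Fraction], Fraction],
                    max_balls: int = 10_000) -> List[Fraction]:
    """Greedy centers y_0 < y_1 < ... in [0, 1] whose balls B(y_i, phi(y_i)) cover [0, 1]."""
    centers = [Fraction(0)]
    # Each new center is the right endpoint of the previous open ball, so that
    # endpoint (which the previous ball misses) is covered by the new ball.
    while centers[-1] + phi(centers[-1]) <= 1:
        if len(centers) >= max_balls:
            raise RuntimeError("no finite subcover found within max_balls")
        centers.append(centers[-1] + phi(centers[-1]))
    return centers


def is_covered(x: Fraction, centers: List[Fraction],
               phi: Callable[[Fraction], Fraction]) -> bool:
    """Check whether x lies in some open ball B(y, phi(y)) with y among the centers."""
    return any(abs(x - y) < phi(y) for y in centers)


def example_phi(y: Fraction) -> Fraction:
    """A hypothetical radius function, bounded below by 1/10."""
    return Fraction(1, 10) + y / 4
```

Calling `finite_subcover(example_phi)` returns a short list of rational centers, and `is_covered` can then be used to spot check points of $[0, 1]$ against the resulting balls.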


Proposition 12.4.9 (Normann and Sanders [234]). The principle $\mathsf{HBU}$ is provable
from the system $\mathsf{RCA}_0^\omega + (\exists^3)$, where


$$(\exists^3):\ (\exists E^3)(\forall x^2)\big(E(x) = 0 \leftrightarrow (\exists w^1)[x(w) = 0]\big).$$
Moreover, $\mathsf{HBU}$ is not provable in $\mathsf{RCA}_0^\omega$ plus $\Pi^1_k$ comprehension, for any $k$.


Sanders and Normann obtain similar results for a number of additional theorems
of analysis, including versions of Cousin’s lemma, Lindelöf’s lemma, and theorems
about the gauge integral. Sanders [272] has also used higher order arithmetic to study
the coding inherent in second order reverse mathematics.


Definition 12.4.10. The following definitions are made in $\mathsf{RCA}_0^\omega$.


• A set of reals is coded by its characteristic function. If $A, B$ are sets of reals, we
can state $A \subseteq B$ using the characteristic functions.

• A set $A$ of reals is open if, for each $x \in A$, there is some $r \in \mathbb{Q}^+$ with $B(x, r) \subseteq A$.

• A set $A \subseteq \mathbb{R}$ is countable if there is a function $F\colon \mathbb{R} \to \mathbb{N}$ such that $F \upharpoonright A$ is
injective.

• A set $A \subseteq \mathbb{R}$ is second countable if there is a countable sequence of open balls
such that every open covering of $A$ can be written as a union of balls from the
sequence.


Proposition 12.4.11 (Sanders [271]).


1. Over $\mathsf{RCA}_0^\omega$, $(\exists^2)$ is equivalent to the principle that every sequence $(x_n)$ of reals
is a countable set.


2. Over $\mathsf{RCA}_0^\omega$, the principle that the unit interval is second-countable implies $(\exists^2)$.


Higher order systems can also be used to examine the strength of our coding 
system for continuous functions, as in the following theorem. 


Proposition 12.4.12 (Kohlenbach [184]). $\mathsf{E\text{-}PRA}^\omega + \mathsf{QF\text{-}AC}^{1,0} + \mathsf{QF\text{-}AC}^{0,1}$ does not
prove that every continuous functional $\Phi^2$ (that is, $\mathbb{N}^\mathbb{N} \to \mathbb{N}$) can be represented
through a continuous function code as in Definition 10.3.3.

Constructive versions of higher order arithmetic can also be studied. These use
a similar framework to $\mathsf{E\text{-}PRA}^\omega$, using constructive logic (without the law of the
excluded middle) rather than classical logic. Constructive reverse mathematics is
typically carried out in informal systems, like ordinary constructive mathematics, 
but can be formalized into constructive systems of higher order arithmetic. 

Using methods from proof theory, constructive higher order systems can also be 
used to study the principle of the excluded middle [159] and formalized Weihrauch 
reducibility [161]. Uftring [315] studies formalized Weihrauch reducibility with a 
version of higher order arithmetic incorporating features from linear logic. 


12.5 Exercises 


Exercise 12.5.1. Prove Lemma 12.1.5. 


Exercise 12.5.2. Give a careful proof in $\mathsf{ATR}_0$ of the existence of the set $Z$ in
Lemma 12.1.7.


Exercise 12.5.3. Prove the following in $\mathsf{RCA}_0$.


1. For all well orderings X, it is not the case that X — X. 
2. If x <x y, then there is no embedding of X<x¥] into X<x+1, 


Exercise 12.5.4. Prove Lemma 12.1.23. 
Exercise 12.5.5. Prove Lemma 12.1.17. 


Exercise 12.5.6 (Simpson [288]). Show that $\mathsf{ACA}_0$ proves that if $B \subseteq \mathbb{N}^{<\mathbb{N}}$ is a Borel
code and $X \in \mathbb{N}^\mathbb{N}$ then the evaluation map for $B$ at $X$ (if it exists) is unique.


Exercise 12.5.7. Prove Proposition 12.3.5. 


Exercise 12.5.8. Prove that every finite type is equivalent to a pure type. Moreover,
$\deg(\rho \to \tau)$ is $\max\{\deg(\rho) + 1, \deg(\tau)\}$.


Exercise 12.5.9. Show that the $\Sigma^0_1$ induction scheme is provable in $\mathsf{RCA}_0^\omega$.
Exercise 12.5.10. Prove Proposition 12.4.5. 


Exercise 12.5.11. Let $U \subseteq \omega^\omega$ be an undetermined set. Use $U$ to construct a set
$A \subseteq \omega^\omega$ so that $A$ is determined but its complement is not.